1  Motivation for the Research


It  is  widely  recognized  that  the  cost  of  software  is  far  outstripping  that  of  hardware, 
and  that  software  repair,  improvement,  and  enhancement  typically  consume  the  major 
portion  of  the  cost.  This  is  true  even  in  light  of  recent  advances  in  languages  and 
tools,  which  are  more  than  offset  by  the  increasing  size  and  complexity  of  software 
systems. 

There  are  several  reasons  for  the  high  cost  of  modifications.  One  is  that  system 
requirements may be wrong or imprecise. Research on rapid prototyping is expected
to  help  in  this  regard.  Another  reason  is  that  a  system  implementation  may  not  meet 
its  requirements.  Approaches  to  this  problem  include  testing,  formal  verification,  and 
runtime  assertion  checking. 

However,  the  dominant  source  of  cost  is  system  modification,  which  is  necessitated 
by  changes  in  requirements  and  in  the  support  environment.  If  the  design  or  the 
implementation  of  a  large  system  is  changed,  the  incremental  cost  of  an  individual 
change  can  be  unacceptably  high  because  a  seemingly  minor  change  to  one  part  of  a 
system  can  have  unforeseen  and  subtle  consequences  in  another  part. 

Some changes will always have far-reaching effects. The root cause is the
implementation goal of good performance, which usually dominates and conflicts with
the goals of maintaining clarity and structure. In this research, we have devised
formal techniques that can substantially reduce the cost of modifying large systems,
especially those systems that have been optimized for performance. The techniques
have been implemented and apply to a large class of sequential systems containing
such objects as modules, procedures, and variables.

The ways in which objects can be related are limited at present, reducing design and
implementation flexibility. We believe that our results can be generalized to handle
powerful parameterization mechanisms, object-oriented paradigms, and concurrency.
However, this has not yet been done.

2  Our  Basic  Approach 

A formal system development involves the two transitions shown in Figure 1. First,
an informal description of requirements is transformed into a formal statement of what
the system is intended to do. Second, that formal specification is transformed into an
implementation that describes how to perform the specified computation. It is the
what-to-how transformation that is the subject of this research. Henceforth, we use
the terms "structural design" and "design" to refer to what-to-how transformations.

Figure 1: Steps in a formal system development

For managing the evolution of a system, the crucial difference between a
specification and an implementation is not the difference in concreteness. It is the
complexity of the interconnections among system objects. An implementation,
especially an efficient one, will invariably be highly interconnected and somewhat
unstructured. On the other hand, performance is not an issue in specifications and,
hence, they tend to be modular.

Therefore, our approach to change focuses on the formal documentation and
analysis of large-system structures. In particular, we formally record the structural
design of a system during its development, and then use the record to:

1.  Explain  system  organization.  A  formal  record  of  structural  design  decisions 
can  explain  how  abstract  objects  and  interactions  are  evolved  into  a  concrete 
implementation.  This  information  is  needed  to  make  changes  to  a  system. 

2.  Control implementation connectivity. Structural invariants can be
enforced automatically. It is decidable whether or not a structural design and its
implementation are consistent under certain reasonable assumptions.

3.  Find the effects of changes. If the structural design or implementation of a
system is changed, new bugs can be introduced. The number of new bugs can
be greatly reduced if developers are provided with an accurate assessment of the
effects of a planned change. The semantic effects of a change can be isolated in
a system design or implementation through an analysis of (i.e., proofs about)
its structure.

3  Summary  of  Main  Results 

Our research has produced four major results:

1.  The first formal technique for specifying implementation structures. Existing
formal specification languages can describe the structure of a specification, but
not the intended structure of an implementation. A specification of
implementation structure is needed to explain and enforce structural design
decisions, the key to managing complexity in large systems.

2.  A new method for reasoning about changes to a system. The problem of
reasoning about changes was originally formalized by this author in his Ph.D. thesis
[11]. That solution used a Hoare-style logic and had the practical drawback of
undecidability. We have devised a radically new approach to the problem that
involves a structural approximation of the semantic effects of a change. An
approximate solution is intuitively appealing and it can be found in a decidable
theory.

3.  A new technique for controlling interconnections in a system implementation
using standard program analysis and formal verification technology. The
question of whether a structural specification is consistent with an implementation
is decidable under certain reasonable assumptions.

4.  An innovative prototype system, called PegaSys, that uses pictures as formal
documentation. To our knowledge, PegaSys is the first system to manipulate
nontrivial design structures in ways that take into account their semantics.
There are over a dozen commercially available interactive diagramming systems
that have proved successful in industry, but all lack semantics.

Two papers are attached to this final report as appendices. They explain the
points above in some detail. The remainder of this section summarizes our results in
these areas.

3.1  Structural  Designs 

To  illustrate  what  is  meant  by  structural  design,  consider  the  following  diagram: 


A  superficial  reading  of  this  diagram  might  be  that  boxes  A  and  B  are  abstract 
operations  and  the  bold  line  is  a  communication  channel  c  between  them.  Although 
this  reading  gives  a  general  idea  of  what  is  meant  by  the  diagram,  it  is  much  too 
imprecise  for  our  purposes.  Here  are  some  properties  the  diagram  might  indicate: 


•  Direct  communication:  Operations  A  and  B  communicate  directly  through 
channel  c  when  they  are  executed. 

•  Indirect communication: Operations A and B communicate indirectly through
intermediaries via channel c. Indirect relationships result from the cumulative
effects of procedure calls on nonlocal objects and on objects passed as
parameters.

•  Completeness:  The  only  way  that  A  and  B  communicate  is  through  channel 
c.  That  is,  there  are  no  other  channels  between  A  and  B. 

A formal description of the above properties is needed so that there is no
ambiguity about what is intended and so that analysis based on the specification
will be meaningful.

The formal structural specification of a system consists of multiple levels of detail,
each level containing abstractions appropriate to that level. Most abstract objects are
represented naturally as primitive procedures or variables, which are subsequently
implemented in terms of one or more similar objects. On the other hand, most abstract
connections are expressed as derived concepts defined in terms of more primitive
connections. Abstract connections are used to partition a system into manageable parts
that interact in well-defined and predictable ways.

Next,  we  present  three  concrete  examples  of  structural  design  concepts,  defined 
in  terms  of  the  following  primitive  concepts  of  our  logic: 

mod(P, x) means that procedure P modifies variable x directly or indirectly through
a called procedure. The mod relation is used for global program optimization
in compilers.

$x \stackrel{P}{\Rightarrow} y$ means that information flows from a variable x to a variable y under
procedure P provided a change in the value of x can be conveyed to y when P
is executed. For example, the binding of an actual parameter a to a formal
parameter x causes a flow from a to x.¹

$\Rightarrow_f$, $\Rightarrow_b$, $\Rightarrow_l$ stand for forward, backward, and lateral information flow,
respectively. Forward and backward flows model the interprocedural variable bindings
that result from a direct or transitive procedure call. Lateral flow is
intraprocedural, involving local variables of the same procedure. Henceforth,
$x \Rightarrow y$ is taken to mean that (x, y) is a forward, backward, or lateral flow.

¹ Classical information theory (Shannon [13] and others) is concerned with the amount of
information generated by a particular event. We are interested in the simpler qualitative question of
whether any information is generated by an event. In other words, we are interested in whether a
change affects an object at all, not in how much it affects it.
callByVR(P, Q, plist) models a procedure call from procedure P directly to
procedure Q with an arbitrary number of actual-formal parameter pairs (plist) having
a value-result semantics.

Types var and proc are used to denote variables and procedures, respectively.
Variables of type vvar are used to specify the different value assignments to an ordinary
variable. The predicate versionOf($\hat{x}$, y) tells whether a version variable $\hat{x}$ is associated
with a variable y and the predicate varOf($\hat{x}$, P) tells whether version variable $\hat{x}$ is
associated with procedure P.

Example  1  Protecting  a  variable.  It  is  often  useful  to  restrict  access  to  a  variable 
or  to  restrict  the  ways  in  which  a  variable  can  be  used.  For  instance,  we  may  want 
to  allow  procedures  to  read  a  certain  variable  but  not  allow  them  to  write  it.  This  is 
easily  formalized  using  the  predicate 

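$$\mathit{ReadOnly}: \mathit{var} \times \mathit{proc} \to \mathit{bool}$$

which is defined by

$$\mathit{ReadOnly}(x, P) \stackrel{\mathrm{df}}{=} \neg\,\mathit{mod}(P, x)$$

for x in var and P in proc. For a given variable v, we can write

$$(\forall p: \mathit{proc})\ \mathit{ReadOnly}(v, p)$$

to indicate that no procedure p can modify v. □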
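As a minimal sketch of how the ReadOnly predicate could be checked mechanically, the following Python fragment computes a mod relation as a fixed point and then tests the predicate. The inputs `direct_mod` (variables each procedure assigns itself) and `calls` (direct calls), along with all procedure and variable names, are hypothetical; parameter binding is ignored for brevity.

```python
def mod_relation(direct_mod, calls):
    """Return {proc: variables it modifies directly or through a callee}."""
    mod = {p: set(vs) for p, vs in direct_mod.items()}
    for p in calls:
        mod.setdefault(p, set())
    changed = True
    while changed:  # iterate to a fixed point so call chains are followed
        changed = False
        for caller, callees in calls.items():
            for callee in callees:
                extra = mod.get(callee, set()) - mod[caller]
                if extra:
                    mod[caller] |= extra
                    changed = True
    return mod


def read_only(x, p, mod):
    """ReadOnly(x, P) holds iff mod(P, x) does not."""
    return x not in mod.get(p, set())


def read_only_for_all(x, mod):
    """(forall p: proc) ReadOnly(x, p) over the known procedures."""
    return all(read_only(x, p, mod) for p in mod)


# Hypothetical system: Main calls Init and Report; Report calls Lookup.
direct_mod = {"Init": {"table"}, "Lookup": set(), "Report": set()}
calls = {"Report": {"Lookup"}, "Main": {"Init", "Report"}}
mod = mod_relation(direct_mod, calls)

print(read_only("table", "Lookup", mod))
print(read_only("table", "Main", mod))
```

Here Main is not read-only with respect to `table` even though it never assigns it, because the modification is inherited from the called procedure Init.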

Example  2  Restricting  variable  interactions.  A  set  of  variables  can  be  partitioned 
into  independent  subsets  using  a  predicate  which  says  that  a  variable  x  is  completely 
independent  of  a  variable  y  if  and  only  if  a  change  in  the  value  of  y  has  no  effect  on 
the  value  of  x.  This  predicate 

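$$\mathit{IndependentOf}: \mathit{var} \times \mathit{var} \to \mathit{bool}$$

is defined, for x and y in var, by

$$\mathit{IndependentOf}(x, y) \stackrel{\mathrm{df}}{=} (\forall \hat{x}, \hat{y}: \mathit{vvar})(\forall P: \mathit{proc})[\mathit{versionOf}(\hat{x}, x) \wedge \mathit{versionOf}(\hat{y}, y) \supset \neg(\hat{y} \stackrel{P}{\Rightarrow} \hat{x})]$$

If a variable x is independent of a variable y, we know that y cannot use x as an
intermediary to affect some other variable or procedure. □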
Example 3  Interprocedural channel. Suppose that we want two procedures to
communicate through a specific variable. We say that a variable x is a channel from
procedure P to procedure Q iff information flows from P to Q through x. This is
captured by

$$\mathit{ChannelTo}: \mathit{proc} \times \mathit{proc} \times \mathit{var} \to \mathit{bool}$$

which is defined by

$$\mathit{ChannelTo}(P, Q, x) \stackrel{\mathrm{df}}{=} (\exists \hat{x}, \hat{y}, \hat{z}: \mathit{vvar})(\exists R: \mathit{proc})[\mathit{versionOf}(\hat{x}, x) \wedge \mathit{varOf}(\hat{y}, P) \wedge \mathit{varOf}(\hat{z}, Q) \wedge ((\hat{y} \stackrel{R}{\Rightarrow}_f \hat{x} \wedge \hat{x} \stackrel{R}{\Rightarrow}_f \hat{z}) \vee (\hat{y} \stackrel{R}{\Rightarrow}_b \hat{x} \wedge \hat{x} \stackrel{R}{\Rightarrow}_f \hat{z}) \vee (\hat{y} \stackrel{R}{\Rightarrow}_b \hat{x} \wedge \hat{x} \stackrel{R}{\Rightarrow}_b \hat{z}))]$$

for P and Q in proc and x in var. Since x is an interprocedural channel, we need not
consider lateral flows whose purpose is to link interprocedural flows. We also rule out
the possibility of a forward-backward flow, since this would make x a channel from
P to itself. □
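The channel predicate can be illustrated with a small Python sketch. It is a simplification under stated assumptions: version variables are collapsed into ordinary variables, the quantified procedure is dropped, and flows are given as hypothetical triples `(source, target, kind)` with kind `"f"` (forward), `"b"` (backward), or `"l"` (lateral). All names in the example data are invented.

```python
# Allowed (into x, out of x) combinations; forward-then-backward is excluded
# because it would make x a channel from a procedure to itself.
ALLOWED = {("f", "f"), ("b", "f"), ("b", "b")}


def channel_to(p, q, x, flows, var_of):
    """True if a variable of p flows into x and x flows into a variable of q
    through one of the allowed pairs of interprocedural flows."""
    into_x = [k for (s, d, k) in flows if d == x and var_of.get(s) == p]
    out_of_x = [k for (s, d, k) in flows if s == x and var_of.get(d) == q]
    return any((k1, k2) in ALLOWED for k1 in into_x for k2 in out_of_x)


# Hypothetical example: Main calls Producer(buf) and then Consumer(buf).
# Producer's formal "out" flows back to buf; buf flows forward to "inp".
flows = {("out", "buf", "b"), ("buf", "inp", "f")}
var_of = {"out": "Producer", "inp": "Consumer", "buf": "Main"}

print(channel_to("Producer", "Consumer", "buf", flows, var_of))
print(channel_to("Consumer", "Producer", "buf", flows, var_of))
```

Lateral flows never match an entry in `ALLOWED`, which mirrors the remark that they need not be considered for an interprocedural channel.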

Example  4  Interprocedural  partitioning.  Assume  that  a  procedure  A  is  not  intended 
to  be  connected  to  a  procedure  B,  which  we  express  by 

$$\neg\,\mathit{ConnectedTo}(A, B)$$

The ConnectedTo relation says that, for any procedures P and Q, there is a transitive
call from P to Q, or a transitive information flow from a variable referenced by P to
one referenced by Q, or both. The predicate

$$\mathit{Calls}: \mathit{proc} \times \mathit{proc} \to \mathit{bool}$$

is defined recursively by

$$\mathit{Calls}(P, Q) \stackrel{\mathrm{df}}{=} (\exists p: \mathit{plist})[\mathit{callByVR}(P, Q, p) \vee (\exists R: \mathit{proc})[\mathit{callByVR}(P, R, p) \wedge \mathit{Calls}(R, Q)]]$$

where plist is a set of possible actual-formal pairings. The predicate

$$\mathit{ConnectedTo}: \mathit{proc} \times \mathit{proc} \to \mathit{bool}$$

is defined by

$$\mathit{ConnectedTo}(P, Q) \stackrel{\mathrm{df}}{=} \mathit{Calls}(P, Q) \vee (\exists \hat{x}, \hat{y}: \mathit{vvar})(\exists R: \mathit{proc})[\mathit{varOf}(\hat{x}, P) \wedge \mathit{varOf}(\hat{y}, Q) \wedge \hat{x} \stackrel{R}{\Rightarrow}{}^{*} \hat{y}]$$

Notice that information may flow from P to Q as the result of a transitive call from
P to Q (in which case R is P), or R can be a parent of P and Q that transmits a
return flow from P to Q. □
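The recursive definition of Calls amounts to the transitive closure of the direct call relation, which the following Python sketch computes before testing a simplified ConnectedTo. The sketch assumes that the transitive information flows are supplied as precomputed pairs (`trans_flows`) and that version variables are collapsed into ordinary variables; the procedure and variable names are hypothetical.

```python
def calls_closure(direct_calls):
    """Calls(P, Q): Q is reachable from P through one or more direct calls."""
    reach = {p: set(qs) for p, qs in direct_calls.items()}
    changed = True
    while changed:
        changed = False
        for p in reach:
            for r in list(reach[p]):
                new = reach.get(r, set()) - reach[p]
                if new:
                    reach[p] |= new
                    changed = True
    return reach


def connected_to(p, q, reach, trans_flows, var_of):
    """Transitive call from p to q, or a transitive flow from a variable
    of p to a variable of q, or both."""
    if q in reach.get(p, set()):
        return True
    return any(var_of.get(x) == p and var_of.get(y) == q
               for x, y in trans_flows)


# Hypothetical system: Main calls Parse, and Parse calls Lex.
direct_calls = {"Main": {"Parse"}, "Parse": {"Lex"}, "Lex": set()}
reach = calls_closure(direct_calls)

print(connected_to("Main", "Lex", reach, set(), {}))
print(connected_to("Lex", "Main", reach, set(), {}))
```

A partitioning requirement such as the negated ConnectedTo assertion above would then be checked by confirming that `connected_to` returns False for the pair in question.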
    PROCEDURE AddInc(sum, i)
        ASSERT call(Add, (sum, i)) AND call(Inc, (i))
    END;

    PROCEDURE Add(a, b)
        ASSERT affects(a, a) AND affects(b, a)
    END;

    PROCEDURE Inc(z)
        ASSERT call(Add, (z, 1))
    END;

(a)

[Diagram (b): information flows among sum, i, a, b, 1, and z]

[Diagram (c): call relationships among AddInc, Add, and Inc]

Figure 2: A low-level design: (a) textual representation, (b) diagram of implicit
information flows, and (c) graph of call relationships.

3.2  Reasoning  About  Changes 

It is undecidable in general to determine the exact behavioral effects of a change, but it
is possible to obtain a precise, conservative approximation by formalizing the problem
in terms of structural concepts. In particular, we say that a change to an object x
affects an object y if the pair (x, y) is in the transitive closure of the information
flow relation $\Rightarrow$. Unfortunately, information flow is not transitive in the usual sense.
That is, if there is flow from some object x to an object y and from y to an object
z, there is not necessarily flow from x to z. As a consequence, the usual notion of
transitivity gives a crude approximation of the effects of a change. To obtain a more
accurate approximation of the true transitive flows (i.e., those that would occur when
the system is executed), it is necessary to decompose the concept of information flow
into the three special flows and to provide axioms for composing the three flows to
determine the transitive closure of the information flow relation.

As an illustration of why the simple approach will not work, consider the low-level
design presented in Figure 2a. The design consists of several objects: procedures
AddInc, Add, and Inc, and variables sum, i, a, b, and z. Parameters are transmitted
using a value-result semantics. The purpose of procedure AddInc, which is not
specified in the figure, is to add the initial values of i and sum and return the result
in sum; it also increments the initial value of i and returns the result in i.
We are interested, for instance, in whether or not a change to the value of variable
sum can affect the value of variable z. Since two objects can interact indirectly
through any number of intermediaries, the question is not whether $\mathit{sum} \Rightarrow z$, but
whether the pair (sum, z) is in the transitive closure of $\Rightarrow$ on the set of all variables,
written $\mathit{sum} \Rightarrow^{*} z$.

We cannot form the transitive closure of $\Rightarrow$ until the information flow
relationships implicit in the assert statements are made explicit. The assertion for AddInc
says that it makes two calls, one to Add and one to Inc; the ordering of the calls is
unspecified for the moment. The assertion for Add says that the initial values of a
and b affect some future value of a. The assertion for Inc specifies that it calls Add.

Figure 2b depicts the information flow relationships implicit in this structural
description. Flows not in the figure are assumed to be invalid, since we found it
useful to have a closed-world assumption [12]. For example, we assume there is no
flow in Add from a to b, thereby making b a read-only variable of Add. Procedure calls
normally cause bidirectional flow between actual and formal parameters. However,
the two calls to Add cause only unidirectional flow from actual parameters i and 1 to
formal parameter b, since the value of b is unchanged by Add.

Returning to our original question about the possibility of flow from sum to z, we
can use the diagram in Figure 2b to trace the information flow path

$$\mathit{sum} \Rightarrow a \Rightarrow z \qquad (1)$$

from which we can infer that (sum, z) is in the transitive closure, $\mathit{sum} \Rightarrow^{*} z$. This
kind of reasoning can be formalized in terms of the usual transitivity axiom.
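The naive analysis can be reproduced with an ordinary transitive closure. The Python sketch below uses an edge list that is an assumed reading of the flows described in the text for Figure 2b (it is not a transcription of the figure), with the constant 1 written as the string `"1"`.

```python
def transitive_closure(edges):
    """Plain transitive closure of a flow relation given as (src, dst) pairs."""
    closure = {}
    for s, d in edges:
        closure.setdefault(s, set()).add(d)
        closure.setdefault(d, set())
    changed = True
    while changed:
        changed = False
        for n in closure:
            for m in list(closure[n]):
                new = closure[m] - closure[n]
                if new:
                    closure[n] |= new
                    changed = True
    return closure


# Assumed flows for the AddInc example.
FLOWS = [
    ("sum", "a"), ("a", "sum"),  # AddInc calls Add(sum, i); a is value-result
    ("i", "b"),                  # b is read-only in Add, so nothing flows back
    ("i", "z"), ("z", "i"),      # AddInc calls Inc(i)
    ("z", "a"), ("a", "z"),      # Inc calls Add(z, 1)
    ("1", "b"),
    ("a", "a"), ("b", "a"),      # affects(a, a) and affects(b, a) inside Add
]

closure = transitive_closure(FLOWS)
print("z" in closure["sum"])  # the naive closure reports a flow from sum to z
```

The closure treats every pair of adjacent flows as composable, which is why it reports a flow from sum to z regardless of which call produced each individual flow.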

Unfortunately, this simple analysis is much too conservative. A change to the value
of sum cannot affect z if there is no execution sequence for which $\mathit{sum} \Rightarrow z$. We can
establish that there is no such sequence in our example specification by relating the
information flow relationships in the specification to its calling relationships, which
are illustrated in Figure 2c. Consider the call from AddInc to Add. Procedure Add
contains no procedure calls and it does not allow the value of formal a to affect the
value of its other formal b. Therefore, the call from AddInc to Add can only affect
the value of sum by means of the path

$$\mathit{sum} \Rightarrow a \Rightarrow \mathit{sum}$$

Procedure AddInc also calls Inc with actual parameter i. But since i is never affected
by sum (by the closed-world assumption), the call from AddInc to Inc cannot result
in a flow from sum to z; from this we can conclude that the call from Inc to Add
cannot either. This completes an informal argument that $\neg(\mathit{sum} \Rightarrow^{*} z)$.
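The informal argument can be mimicked by a reachability search that only lets a backward flow return to the call site that produced the matching forward flow. The Python sketch below uses this standard matched call/return technique purely as an illustration; it is not the axiomatization developed in this research. The call-site labels `c1`, `c2`, `c3` and the edge list are assumptions based on the text, and the depth cap is a guard that presumes a non-recursive call graph.

```python
from collections import deque


def matched_reach(start, edges, max_depth=8):
    """Variables reachable from `start` along paths whose returns match calls.

    edges: (src, dst, kind, site) tuples with kind in {"call", "ret", "local"};
    site names the call site and is None for local (lateral) flows.
    """
    out = {}
    for src, dst, kind, site in edges:
        out.setdefault(src, []).append((dst, kind, site))
    seen = {(start, ())}
    queue = deque(seen)
    reached = set()
    while queue:
        node, stack = queue.popleft()
        for dst, kind, site in out.get(node, []):
            if kind == "call":
                if len(stack) >= max_depth:
                    continue  # guard only; assumes no recursion in practice
                nxt = stack + (site,)
            elif kind == "ret":
                if stack and stack[-1] != site:
                    continue  # cannot return to a caller that did not call
                nxt = stack[:-1]
            else:
                nxt = stack
            reached.add(dst)
            if (dst, nxt) not in seen:
                seen.add((dst, nxt))
                queue.append((dst, nxt))
    return reached


# c1: AddInc calls Add(sum, i); c2: AddInc calls Inc(i); c3: Inc calls Add(z, 1)
EDGES = [
    ("sum", "a", "call", "c1"), ("a", "sum", "ret", "c1"),
    ("i", "b", "call", "c1"),
    ("i", "z", "call", "c2"), ("z", "i", "ret", "c2"),
    ("z", "a", "call", "c3"), ("a", "z", "ret", "c3"),
    ("1", "b", "call", "c3"),
    ("a", "a", "local", None), ("b", "a", "local", None),
]

print("z" in matched_reach("sum", EDGES))
print("sum" in matched_reach("i", EDGES))
```

Under these assumptions a change to sum reaches only a and sum itself, because the value that entered Add through `c1` is not allowed to leave through the return of `c3`, whereas a change to i still reaches sum.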

To formalize this kind of reasoning, a new axiomatization of the transitivity of
information flow was developed in which a transitive flow is inferred from two
individual flows only when there is a causal relationship between the individual flows.
That is, we establish that some change in the value of variable x in the flow $x \Rightarrow y$
causes a change in the value of variable z in the flow $y \Rightarrow z$ before we can infer the
transitive flow $x \Rightarrow z$.

Let T denote a logical axiomatization of transitive information flow that takes into
account the causal relationships among individual flows. In addition, let S denote
a structural specification, and let J denote axioms for inferring the flows implicit
in specifications. To reason about changes, we have defined T and J so that the
transitive closure of $\Rightarrow$, namely,

$$\{(x, y) \mid T \cup J \cup S \vdash x \Rightarrow y\},$$

contains the true information flows in S for systems containing the basic structural
features discussed at the beginning of this section. Moreover, the derivation of the
closure should terminate for any specification S. The transitive closure with respect
to a given specification S serves as the basis for answering questions about changes
to S.

3.3  Controlling  Connections  Through  Proofs 

Given  a  structural  specification  S  of  a  system,  we  would  like  to  know  that  S  accurately 
describes  its  implementation.  This  too  is  undecidable  in  general,  but  again  we  can 
obtain  a  good  conservative  approximation. 

The  proof  strategy  blends  program  flow  analysis  and  formal  program  verification 
techniques. It suffices to show that each level in a structural design hierarchy
is  a  logical  consequence  of  those  primitive  structural  relations  that  are  true  of  the 
program.  Objects  in  a  specification  are  connected  to  program  objects  with  a  mapping 
similar  to  the  one  in  Hoare  [7].  The  primitive  relations  satisfied  by  a  program  are 
derived  from  the  program  automatically  with  a  slightly  modified  version  of  standard 
program  flow  analysis  techniques  [1].  The  flow  analysis  is  conservative:  for  example, 
all  predicates  in  the  program  are  treated  as  uninterpreted  symbols.  Given  the  derived 
primitives  and  the  mappings,  the  problem  of  consistency  can  be  reduced  to  proving 
one  or  more  logical  implications  in  a  typed  first-order  logic,  where  a  type  is  a  finite 
and  fixed  set. 
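Because every type is a finite, fixed set, a quantified formula over those types can in principle be decided by enumerating its instances. The Python sketch below illustrates only that observation with brute-force enumeration; it is not the proof procedure used in this work. The sets `PROC`, `VAR`, and `MOD` are hypothetical stand-ins for the types and for primitive facts assumed to have been derived by flow analysis.

```python
from itertools import product


def forall(domains, predicate):
    """Decide a universally quantified formula over finite domains."""
    return all(predicate(*values) for values in product(*domains))


def exists(domains, predicate):
    """Decide an existentially quantified formula over finite domains."""
    return any(predicate(*values) for values in product(*domains))


PROC = {"Init", "Lookup"}
VAR = {"table", "limit"}
MOD = {("Init", "table")}  # assumed output of flow analysis: mod(Init, table)

# Specification: (forall p: proc) ReadOnly(limit, p), i.e. not mod(p, limit).
limit_is_protected = forall([PROC], lambda p: (p, "limit") not in MOD)
table_is_protected = forall([PROC], lambda p: (p, "table") not in MOD)

print(limit_is_protected, table_is_protected)
```

Enumeration always terminates because the domains are finite, although its cost grows with the product of the domain sizes.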

3.4  The  Initial  PegaSys  Prototype 

PegaSys is a display-oriented, interactive environment that uses intuitive graphical
pictures as formal documentation to facilitate life cycle activities for large software
systems. PegaSys has been designed to offer the advantages of mathematical rigor
even though users interact with it through pictures. For example, PegaSys provides
standard graphical operations, via a mouse and pointing device, for manipulating
pictures, while at the same time enforcing semantic constraints on these operations
sufficient to guarantee that they make sense in terms of system design. As a
result, PegaSys users can document and explain system designs in a highly visual and
intuitive manner.

Because of their intuitive appeal, pictures have been used frequently by computer
scientists in textbooks, professional publications, and on blackboards to explain
system structures. However, pictures tend to be inadequate as a means of documentation
because they contain imprecise concepts that can be confusing and misleading. For
instance, the same icon is often used to represent a process, a subprogram, and a
data structure, all in the same picture. Similarly, the same arrow may represent the
flow of data to a subprogram, the flow of control to a subprogram, or the writing of
data to a data structure, all quite distinct concepts. Failure to make such distinctions
might be satisfactory in a high-level design, but is not acceptable for detailed design
refinements that serve as the basis for system evolution.

The goal of the PegaSys system research has been to demonstrate that it is
possible to effectively support the formal specification and analysis of implementation
structures. The approach has been to make use of pictures to simplify specifications
and to take advantage of decidability to eliminate the need for user involvement in
proofs.

The initial PegaSys prototype was extremely effective in creating the illusion that
logical formulas did not exist, thereby providing the advantages of formal methods
but not the drawbacks. The PegaSys prototype deals with pictures that represent
direct connections among such objects as variables, types, procedures, and modules
in sequential systems. As explained in an earlier section, we have since extended
our specification technique to include indirect relations. Lower-level objects, such as
statements and expressions, are not modeled. A system design is a hierarchy of such
pictures, and a PegaSys user must specify the mapping between levels in a design.
Given this mapping, PegaSys can prove that the design levels are consistent with each
other. This prototype also supported programming in Ada and connected pictures to
Ada programs.

We are presently planning a new implementation of PegaSys that incorporates the
advances we have made in its underlying technology. The initial PegaSys prototype
was written in Interlisp-D on (now obsolete) Xerox 1100-series personal computers.
The new implementation would be better engineered, written in portable Common
Lisp on Sun workstations, and would make use of standard components whenever
possible. We believe that the planned version of PegaSys would represent an
important step in the introduction of formal methods in the engineering of real software
systems.
4  Related Research


1.  The  PegaSys  system  is  unique  in  its  use  of  graphics  to  mask  details  in  the  formal 
design  and  proof  of  structural  system  properties.  Other  graphical  systems  have 
very  impoverished  structural  design  languages  (see  below)  and  do  not  perform 
a  semantic  analysis  of  designs.  An  interesting  graphical  system  for  behavioral 
specification  has  recently  been  developed  by  Harel  [3]. 

2.  Previous  work  on  structural  design  languages  falls  into  the  following  categories: 

•  Programming languages. Block structure, import/export lists, and
encapsulation mechanisms have been used to specify referencing environments
in several programming languages, including Euclid, Mesa, Modula-2,
Ada, and PIC/Ada. These features describe access to objects but not the
use of objects. Furthermore, they deal solely with program-level objects.
Module interconnection languages have essentially the same drawbacks.

•  Program design languages. Over a dozen structural design languages
have been developed since they were made popular by Yourdon, DeMarco,
and others; see [9] for a survey. These languages typically represent
program structure as a directed graph in which nodes denote program objects
and arcs denote structural relations among objects. Relations in a graph
are low level and not formally defined. Moreover, there is no mechanism
for properly defining new relations; thus, it is not possible to formalize
many common design abstractions.

•  Formal specification languages. These languages, e.g., Anna [8], Larch
[6], and OBJ [5], focus on behavior, not structure. Some contain a form of
import/export list.

•  Derivational techniques. Program transformations describe how to
transform a given structure into a different and possibly more efficient
structure. A related technique presently gaining much attention involves a
use of constructive type theory in which programs are correct by
construction. Both approaches have been applied primarily to algorithm design.
In contrast, our approach is focused on how algorithms are put together
to form a system.

3.  Previous work on the analysis of system changes is either at too low a level or
limited by undecidability:

•  Program inspection. In practice, the dominant way of determining the
effects of a change is for a human to interpret various relations extracted
from the program itself. The relations include direct (cross-reference)
relations and transitive relations about calling relationships and data flows.
Numerous tools have been developed which derive such relations, but they
are used primarily for other purposes, such as program optimization [2],
the detection of simple errors [4], and documentation [10]. We are not
aware of any existing tool that provides a systematic way of combining the
relations to determine the effects of changes.

•  Semantic  proofs.  In  the  context  of  formal  program  verification,  Moriconi 
[11]  developed  a  general  approach  to  reasoning  about  the  semantic  effects 
of  changes.  Given  a  functional  specification  of  a  system  and  a  Hoare-style 
logic,  the  method  determines  what  formulas  must  be  proved  to  isolate 
the  exact  semantic  effects  of  incremental  changes.  Unfortunately,  these 
formulas  are  in  an  undecidable  theory,  and  experience  indicates  that  they 
cannot  be  proved  without  substantial  human  assistance.  Consequently, 
the  approach  is  impractical  for  everyday  use. 
Visualizing Program Designs
Through  PegaSys 


Mark  Moriconi  and  Dwight  F.  Hare,  SRI  International 


PegaSys is concerned more with explaining program design than
describing programs, and offers more extensive support to
programming in the large than other graphical systems.


This article is an introduction to many of the interesting features
of PegaSys,* an experimental system that encourages and facilitates
extensive use of graphical images as formal, machine-processable
documentation. Unlike most other systems that use graphics to
describe programs, the main purpose of PegaSys is to facilitate the
explanation of program designs.

A program design is described in PegaSys by a hierarchy of interrelated
pictures. Each picture describes data and control dependencies among
such entities as "subprograms," "processes," "modules," and "data
objects," among others. The dependencies include those represented in
flowcharts, structure charts, dataflow diagrams, and module
interconnection languages. Moreover, new abstractions can be defined
as needed.

What is particularly interesting about PegaSys is its ability to: (1)
check whether pictures are syntactically meaningful, (2) enforce design
rules throughout the hierarchical decomposition of a design, and (3)
determine whether a program meets its pictorial documentation. Much of
the power of PegaSys stems from its ability to represent and reason
about different kinds of pictures within a single logical framework.
This framework is transparent to PegaSys users in the sense that
interactions are in terms of icons in pictures. For example, formal
properties of a program are described by standard graphical operations
on icons rather than by sentences written in a formal logic.

* Programming Environment for the Graphical Analysis of SYStems.

Excerpts  from  a  working  session 
with  PegaSys  are  used  to  illustrate  the 
basic  style  of  interaction  as  well  as  the 
three PegaSys capabilities.† We describe
the key ideas behind PegaSys
elsewhere. :J 

Background  and  related  work 

Pictures  have  been  used  extensively 
by  computer  scientists  in  textbooks, 
professional  publications,  and  on 
blackboards  to  explain  dependencies 
in programs. Although pictures may
be  quite  perspicuous,  they  have  tended 
to  be  inadequate  as  a  means  of  doc¬ 
umentation.  One  reason  is  the  use  of 
imprecise  concepts  that  result  in  pic¬ 
tures  that  are  confusing  and  easily 
misinterpreted.  For  example,  the  same 
graphic  symbol  is  often  used  to  repre¬ 
sent  a  process,  a  subprogram,  and  a 
data  structure,  all  in  the  same  picture. 
Similarly,  an  undifferentiated  arrow 
might  represent  the  flow  of  data  to  a 
process,  the  flow  of  control  between 
subprograms,  or  the  writing  of  data 


†This article is a condensed version of a paper contained
in a technical report.1 Because of space
limitations, we have removed many of the pictures
that describe the design of the example system
developed during the session.
into a data structure, all quite distinct
concepts.

While  formal  documentation  does 
not  suffer  from  this  imprecision,  its 
advantages  have  tended  to  be  out¬ 
weighed  by  the  difficulty  of  construct¬ 
ing  and  understanding  it.  Moreover, 
formal  documentation  has  inade¬ 
quately  captured  dependencies  among 
components  of  the  program  it  is  in¬ 
tended  to  describe.  An  understanding 
of  such  dependencies  is  crucial  through¬ 
out  the  software  life  cycle,  especially 
during  maintenance,  and  becomes  in¬ 
creasingly  more  difficult  to  glean  from 
a  program  as  it  increases  in  size  and 
complexity. 

In  light  of  these  observations, 
PegaSys  attempts  to  take  advantage  of 
pictorial  communication  in  describing 
data  and  control  dependencies  while, 
at  the  same  time,  maintaining  the  ad¬ 
vantages  of  mathematical  rigor. 
PegaSys  is  differentiated  from  pre¬ 
vious  graphical  systems  by  its  wider 
range  of  representation  and  analysis 
and  its  more  extensive  support  for 
programming  in  the  large.  Previous 
work  most  closely  related  to  PegaSys 
is  concerned  with  representation  and 
analysis  techniques.  We  review  this 
work  and  then  describe  related 
systems. 

For a system to perform any sort of meaningful analysis of a picture,
it must maintain a logical representation of the picture. A number of
formalisms have been developed that have, or easily could have, a
pictorial rendering. Examples are flowcharts,4 structure charts,5
pictographs,6 dataflow diagrams (surveyed in Davis and Keller7),
plans,8 and module interconnection languages.9-12 All of these
formalisms capture data and control dependencies, typically down to
executable program fragments. Pictures in PegaSys describe what we
believe to be the important design concepts in these formalisms, plus
other concepts as well.


The presence of a logical representation for a picture provides a
basis for reasoning about the picture. In addition to checks for
syntax errors, two other sorts of syntactic analysis of pictures have
been performed by previous systems. The first involves the
hierarchical refinement of a picture. If we think of a picture as a
graphlike diagram, a node in a diagram may be replaced by a diagram
provided that the replacement preserves the connectivity of the
original diagram. Example uses of this idea can be found in Davis and
Keller7 and Rich and Shrobe.8 The second sort of analysis concerns
the relationship between a picture and




the  program  it  is  intended  to  describe. 
If  a  picture  is  not  executable,  it  is  im¬ 
portant  to  verify  whether  it  accurately 
describes  the  program.  For  example, 
the  flow  of  control  in  a  program  can  be 
determined  purely  syntactically  if  we 
assume  that  conditional  control  paths 
may  always  be  executed.  Similarly,  the 
“uses”  and  “requires”  relations  in 
module  interconnection  languages  can 
be  verified  using  type-checking  tech¬ 
niques. 12  In  contrast,  PegaSys  addi¬ 
tionally  places  semantic  constraints  on 
design  refinements  and  programs. 

One  such  constraint  deals  with  the 
logical  consistency  between  a  picture 
and  the  program  it  is  intended  to  de¬ 
scribe.  Traditionally,  program  verifi¬ 
cation  efforts  have  employed  general 
methods  for  establishing  the  logical 
consistency  between  a  formal  specifi¬ 
cation  and  a  program. 13  The  PegaSys 
verification  procedure  is  more  special¬ 
ized and simpler, and does not have


the  practical  drawbacks  of  traditional 
approaches. 

A  related  system  that  deals  with  pro¬ 
gram  dependencies  is  the  PECAN  sys¬ 
tem.14,15 PECAN provides multiple
“views”  of  a  program  by  extracting 
dependencies  directly  from  a  program 
and  then  displaying  them  graphically. 
A  similar,  albeit  nongraphical,  ap¬ 
proach  at  the  level  of  specifications  is 
described  in  Swartout.16  The  ap¬ 
proach  taken  by  PegaSys  differs  in 
that  the  program  designer  is  responsi¬ 
ble  for  describing  a  program  in  terms 
of  the  abstractions  used  in  its  concep¬ 
tualization.  This  approach  is  based  on 
our  belief  that  it  is  difficult,  if  not  im¬ 
possible,  to  generate  these  abstractions 
from  the  final  program. 

Other  related  systems  that  make  ex¬ 
tensive  use  of  graphics  to  describe  as¬ 
pects  of  programs  fall  into  two  major 
categories.  First,  there  are  a  number  of 
systems  for  “animating”  dynamic 
program  execution,  a  good  example  of 
which  is  the  Balsa  system. 17  Balsa  cre¬ 
ates  simulations  in  which  sophisticated 
graphical  representations  of  an  algo¬ 
rithm  and  its  data  structures  are  con¬ 
tinually  updated  throughout  the  exe¬ 
cution  of  the  algorithm.  There  are  other 
examples of animation systems.18-22
The  second  category  is  concerned  with 
“visual  programming,”  i.e.,  program¬ 
ming  by  spatial  arrangement  of 
icons. 23-27  Both  kinds  of  systems  have 
tended  to  focus  almost  exclusively  on 
programming  in  the  small— that  is,  on 
individual  algorithms  and  data  struc¬ 
tures.  Pictures  in  PegaSys,  on  the 
other  hand,  describe  how  algorithms 
and  data  structures  fit  together  to  form 
the  design  of  a  larger  program. 

Getting  started 

Figure 1 shows a bitmap display
connected to a Xerox personal computer.*
Screen real estate is divided

*PegaSys is implemented in Interlisp-D and runs
on Xerox 1100-series personal computers.

Figure 1. First level in broadcast network documentation hierarchy.


into  adjacent,  nonoverlapping  rectan¬ 
gular areas called windows.† Screen
layout  will  be  adjusted  throughout  the 
scenario  in  an  attempt  to  make  max¬ 
imal  use  of  screen  real  estate.  This  is 
done  by  pointing  with  the  mouse.  The 
small  windows  down  the  right-hand 
side  are  menus  containing  commands, 
each  of  which  may  be  selected  by  point¬ 
ing  at  it  with  the  mouse.  The  black  strip 
at  the  top  of  each  window  contains  the 
window’s  name.  The  name  of  a  win¬ 
dow  is  intended  to  be  suggestive  of  its 
contents.  For  example,  the  name 
“Network: Level  1”  indicates  that  the 
contents  of  the  associated  window  is 


†Our display management strategy is patterned
directly after the tiling strategy used in Cedar.28


the  first  level  of  the  picture  hierarchy 
for  a  network. 

For  the  most  part,  arguments  to 
commands  are  selected  or  constructed 
by  pointing,  and  pictures  are  manipu¬ 
lated  by  pointing  as  well.  On-line  help 
and  feedback  on  errors  appear  in  the 
prompt  window  in  the  lower-left  cor¬ 
ner  of  Figure  1. 

The  example  session  used  to  illus¬ 
trate  aspects  of  PegaSys  is  concerned 
with  the  development  of  a  realistic 
broadcast  network.  It  is  not  necessary 
to  understand  the  details  of  the  net¬ 
work  or  its  implementation  in  order  to 
get  a  “feel’’  for  the  capabilities  being 
demonstrated.  Particularly  germane 
aspects  of  the  example  network  are  ex¬ 
plained as needed. As the session pro¬


gresses,  details  of  the  network  are 
omitted  so  as  to  focus  attention  on  the 
aspect  of  PegaSys  being  described. 

The  session  begins  with  the  design  of 
the  overall  broadcast  network  down  to 
the  host  level.  It  then  focuses  on  the 
development  of  a  single  host,  whose 
multilevel  design  and  implementation 
is  reused  several  times  in  the  overall 
network.  The  network  was  imple¬ 
mented in the Ada programming language29
(using PegaSys) and subse¬
quently  run  on  a  Data  General  MV 
10000  computer. 

The  meaning  of  a  picture 

A  crucial  aspect  of  the  PegaSys 
design  is  its  treatment  of  a  picture  as 
[Pop-up comment shown in Figure 2: “This module models a physical
communication line which broadcasts information to all connected
entities.”]

Figure 2. Explanatory text may be associated with computationally
meaningful icons.


both  a  graphical  and  a  logical  struc¬ 
ture.  These  structures  affect  user  in¬ 
teraction  with  PegaSys  in  several  im¬ 
portant  ways. 

Dual  interpretation  of  pictures.  A 

picture  is  represented  as  a  graphical 
structure  composed  of  icons  and  their 
properties,  such  as  size  and  location. 
Icons  in  a  picture  correspond  to  predi¬ 
cates  in  the  underlying  logical  repre¬ 
sentation  of  the  picture.  This  logical 
structure  captures  the  computational 
meaning  of  a  picture;  each  predicate  in 
this  structure  denotes  a  computational 
concept  expressed  by  the  picture. 

The picture shown in Figure 1 contains several icons: four ellipses,
a rectangle, several arrows, and several character strings.* These
icons denote several concepts about the example network. Each of the
four hosts in the network is modeled as a process (indicated by an
ellipse); the communication line by a module (indicated by a dashed
rectangle); and a packet of data by a type (indicated by a label on
arcs).† Interrelationships among hosts, packets, and the line are
described by the “write” relation (denoted by the letter W on arrows)
and the “read” relation (denoted by R).

At a first approximation, the picture says that the broadcast network
consists of four hosts that communicate by means of a line. More
precisely, processes named Host1, ..., Host4 write values of type pkt
into a module called Line and read values of the same type from the
Line module.

*Note that type pkt is represented by text in the lower left of the
picture rather than by an icon. If PegaSys does not have an
appropriate icon for a concept, the convention is to display its
logical representation as text.

†A process sequentially executes a series of actions that may proceed
in parallel with actions of other processes; a module is a collection
of one or more logically related entities; and a type is a, possibly
structured, value set. The line is not modeled as a process because
its actions are initiated by hosts.

This  statement  about  the  network  is 
represented  in  PegaSys  by  a  conjunc¬ 
tion  of  the  predicates 
process(Host1), process(Host2),
process(Host3), process(Host4),
module(Line), Type(pkt),
Write(Host1, Line, pkt),
Read(Host1, Line, pkt)

with  similar  Write  and  Read  predicates 
involving  Host2,  Host3,  and  Host4. 
Notice  that  every  predicate  corre¬ 
sponds  to  a  different  icon  in  Figure  1. 
Purely  cosmetic  changes  to  a  picture, 
such  as  an  adjustment  to  the  size  or 
location  of  an  icon,  do  not  require  up¬ 
dates  to  the  logical  representation  of 
the  picture. 
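
The conjunction above can be sketched as a plain set of predicates. The following is only an illustration, not the PegaSys representation (which is implemented in Interlisp-D): the `Predicate` class, the `network_level_1` function, and the separate `layout` table are hypothetical names introduced here, and we assume that a set of ground predicates is an adequate stand-in for a conjunction.

```python
from typing import Dict, FrozenSet, NamedTuple, Tuple


class Predicate(NamedTuple):
    """One conjunct in the logical representation of a picture."""
    relation: str
    args: Tuple[str, ...]


def network_level_1() -> FrozenSet[Predicate]:
    """Return the conjunction described for Figure 1 as a set of predicates."""
    facts = {Predicate("module", ("Line",)), Predicate("Type", ("pkt",))}
    for i in range(1, 5):
        host = f"Host{i}"
        facts.add(Predicate("process", (host,)))
        facts.add(Predicate("Write", (host, "Line", "pkt")))
        facts.add(Predicate("Read", (host, "Line", "pkt")))
    return frozenset(facts)


# Graphical properties live in a separate (hypothetical) layout table, so a
# cosmetic change such as moving an icon leaves the formula untouched.
layout: Dict[Predicate, Tuple[int, int]] = {
    Predicate("process", ("Host1",)): (40, 30),
}

before = network_level_1()
layout[Predicate("process", ("Host1",))] = (120, 30)  # move the Host1 ellipse
after = network_level_1()

print(len(before))      # 14 predicates: 2 + 4 hosts * 3
print(before == after)  # True
```

Keeping layout apart from the predicates is one simple way to obtain the property stated above, namely that purely cosmetic changes require no update to the logical representation.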

The  logic  in  which  pictures  are  rep¬ 
resented  is  called  the  form  calculus.  A 
syntactically  correct  picture  is  said  to 
describe  the  form  of  a  program  and  is 
represented  by  a  well-formed  formula 
of  the  form  calculus. 

The  following  terminology  will  be 
adopted  to  refer  to  components  of  a 
picture.  Active  entities  may  initiate  ac¬ 
tions  that  create,  destroy,  or  transform 
data  objects  (variables);  the  data  ob¬ 
jects themselves are called passive en¬
tities.  The  existence  of  an  active  or 


passive  entity  is  determined  by  its 
membership  in  a  defining  relation. 
An example of an active entity is process(Host1),
and an example of a
passive entity is Type(pkt). The term
entity  refers  to  both  kinds  of  entities. 
A  relationship  among  entities,  such  as 
specified  by  the  Write  relation,  is 
called  an  interaction. 

Entities  and  interactions  specified 
in  pictures  correspond  to  either 
primitives  of  the  form  calculus  or 
predicates  defined  in  terms  of  the 
primitives.  The  primitives  were  care¬ 
fully  chosen  to  facilitate  the  defini¬ 
tion  of  new  concepts. : 

A  brief  summary  of  the  primitives 
will  suggest  the  general  kinds  of  con¬ 
cepts  that  can  be  expressed  by  pictures 
in  PegaSys.  Active  entities  are  spec¬ 
ified  by  “subprogram,”  “process,” 
and  “module”  relations.  We  have 
chosen this relatively coarse grain in an
attempt  to  capture  the  salient  aspects 
of  the  design  of  a  program,  as  opposed 
to  the  details  of  particular  algorithms. 
Passive  entities  are  specified  by  a 
“name”  relation  and  by  “simple 
type”  and  “structured  type”  relations. 
A  name  is  used  to  refer  to  the  object 
and  a  type  to  denote  a  (possibly  struc¬ 
tured)  value  set.  The  manipulation 


August  1985 




[Figure 3 screen image: the Network: Level 2 window and the PegaSys
command menus (Edit Text, Edit Picture, Hierarchy, Program, and
related commands).]


Figure  3.  Creation  of  a  level  and  selection  of  the  line  for  refinement. 


and  sharing  of  data  objects  are  spec¬ 
ified  by  means  of  primitive  interaction 
relations  that  capture  general  notions 
of  data  object  declaration,  data  object 
visibility,  aliasing  of  names,  modifica¬ 
tion  of  the  value  of  a  data  object,  and 
accessing  the  value  of  a  data  object. 
There  are  also  primitives  for  modeling 
(synchronous  and  asynchronous)  in¬ 
terprocess  communication  and  ordi¬ 
nary  transfer  of  control.  See  reference 
3  for  details. 

Finding  out  about  what  is  not  in  a 
picture.  A  picture  may  be  augmented 
with  explanatory  text.  In  particular, 
text  may  be  associated  with  any  com¬ 
putationally  meaningful  icon.  If  the 
user  points  at  an  icon  and  presses  a 
button  on  the  mouse,  the  associated 


pop-up  comment  will  appear  on  the 
display  until  the  user  releases  the  but¬ 
ton.  Figure  2  shows  the  pop-up  com¬ 
ment  for  the  line  module.  Given  this 
and related features of PegaSys, the
best way to gain an understanding of
the  pictures  presented  here  is  by  means 
of  an  interactive  dialog  with  PegaSys. 


Manipulation  of  pictures 

Interactions  with  PegaSys  are  in 
terms  of  icons.  However,  graphical 
operations  on  pictures  are  restricted  by 
logical  constraints  imposed  by  the 
form  calculus.  These  constraints  are 
intended  to  ensure  that  graphical  oper¬ 
ations  make  sense  computationally. 


Graphical  manipulation  of  pictures. 

Graphical  manipulation  of  a  picture 
depends  upon  a  one-to-one  mapping 
between  computationally  meaningful 
icons  and  predicates.  An  icon  and  its 
associated  predicate  denote  the  same 
concept. 

Perhaps  the  simplest  example  of 
how  PegaSys  takes  advantage  of  this 
mapping  concerns  the  selection  of  a 
concept,  which  is  done  by  pointing  at 
the  appropriate  icon.  For  example, 
positioning  the  mouse  to  point  at  the 
ellipse labeled Host1 in Figure 1 and
clicking (depressing and releasing) a
button on the top of the mouse
results in selection of the predicate
process(Host1).

Another  example  concerns  the  con¬ 
struction  of  pictures.  Pictures  are  con¬ 
structed  by  using  a  series  of  graphical 
operations  on  the  display  that  have  the 
twofold  effect  of  building  a  graphical 
structure and a corresponding formula
in  the  form  calculus.  Each  operation 
involves  the  selection  of  a  concept 
from  a  menu  followed  by  its  placement 
at a location on the screen. An icon is
associated automatically with most
concepts. If a concept must be named,
the user must enter a name for it and
PegaSys  will  size  the  associated  icon 
relative  to  the  size  of  the  name.  Place¬ 
ment  is  done  by  pointing.  Layout  ad¬ 
justments  may  be  made  by  pointing  at 
the  desired  icon  (selection),  pointing  at 
the  destination  location,  and  clicking  a 
button  on  the  mouse.  PegaSys  reposi¬ 
tions  the  selected  icon  at  the  specified 
location,  readjusting  related  icons 
(such  as  connected  arrows)  as  best  it 
can. 

Logical  constraints  on  graphical 
manipulations.  Both  syntactic  and 
semantic  constraints  are  placed  on 
graphical manipulations. An example
of  the  former  concerns  the  construc¬ 
tion  of  pictures.  While  pictures  are 
constructed  by  means  of  standard 
graphical  operations,  the  form  calcu¬ 
lus  guides  the  entire  process.  PegaSys 
uses the grammar of the form calculus
to  guide  the  construction  of  pictures  in 
much  the  same  way  that  a  structure- 
oriented  editor  uses  the  grammar  of  a 
programming  language  to  guide  the 
construction  of  programs.  Pictures 
may  contain  only  concepts  that  are 
primitive  in  the  form  calculus  or  that 
have  been  defined  in  terms  of  the 
primitives.  PegaSys  uses  the  type  con¬ 
straints  on  predicates  to  prevent  a 
nonsensical  composition  of  concepts. 
For example, if a predicate has been
defined to take two processes as its
arguments, PegaSys ensures that both
arguments are provided and, moreover,
that both are processes. If not, a
type error is signaled.
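
A check of this kind might look roughly as follows. This is a sketch under our own assumptions rather than the PegaSys implementation: the `SIGNATURES` table, the `Connect` relation, and the `check_predicate` function are hypothetical, and we assume each entity has a single known kind (process, module, and so on).

```python
from typing import Dict, Sequence, Tuple

# Hypothetical signature table: relation name -> required kind of each argument.
SIGNATURES: Dict[str, Tuple[str, ...]] = {
    "Connect": ("process", "process"),  # illustrative two-process relation
}


def check_predicate(relation: str, args: Sequence[str], kinds: Dict[str, str]) -> None:
    """Raise TypeError unless args match the relation's declared signature."""
    expected = SIGNATURES[relation]
    if len(args) != len(expected):
        raise TypeError(f"{relation} takes {len(expected)} arguments, got {len(args)}")
    for arg, want in zip(args, expected):
        got = kinds.get(arg)
        if got != want:
            raise TypeError(f"{relation}: {arg} is a {got}, expected a {want}")


kinds = {"Host1": "process", "Host2": "process", "Line": "module"}
check_predicate("Connect", ["Host1", "Host2"], kinds)  # accepted silently
try:
    check_predicate("Connect", ["Host1", "Line"], kinds)
except TypeError as err:
    print(err)  # Connect: Line is a module, expected a process
```

Both failure cases named in the text are covered: a missing argument fails the length test, and an argument of the wrong kind fails the per-argument test.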

Semantic  constraints  are  needed  to 
restrict  picture  refinements  and  to 
analyze  the  relationship  between  a  pic¬ 
ture  hierarchy  and  the  program  it  is  in¬ 
tended  to  describe.  In  both  instances, 
it  is  necessary  to  prove  logical  for¬ 
mulas  in  the  form  calculus.  However, 
this  can  be  done  quickly  and  without 
user interaction (due in part to the
decidability of the form calculus).


Hierarchical  decomposition  of 
pictures 

A  hierarchy  of  pictures  related  ac¬ 
cording  to  the  PegaSys  design  rules  is 
said  to  describe  the  design  of  a  program. 

Creating  a  new  level  in  a  hierarchy. 

Each  level  in  a  picture  hierarchy  is  a 
description of a program at a par¬
ticular  level  of  detail.  A  level  is  formed 
by  a  sequence  of  refinements  to  the  im¬ 
mediately  preceding  level  in  the  hierar¬ 
chy.  A  refinement  adds  detail  to  an  ex¬ 
isting  concept  and  is  not  allowed  to 
delete  concepts  from  a  picture.  There¬ 
fore,  a  concept  cannot  appear  at  any 
level  in  a  hierarchy  (except  the  top  one) 
unless  it  is  a  refinement  of  a  more 
abstract  concept. 

The  procedure  for  building  a  new 


Figure  4.  Constructing  a  replacement  for  the  line. 


Figure  5.  Picture  at  level  2  after  refinement  of  the  line. 


level  in  a  hierarchy  is  as  follows.  As 
soon  as  the  user  indicates  a  desire  to 
create  a  new  level,  PegaSys  makes  a 
copy  of  the  immediately  preceding 
level.  The  new  level  is  formed  by  a  se¬ 
quence of refinements to this copy. An
individual  refinement  involves  the 
following  three  steps:  (1)  selection  (by 
pointing)  of  the  relation  to  be  refined, 
(2)  construction  or  selection  of  its  re¬ 
placement, and (3) selection of the ap¬
propriate menu command.* PegaSys
checks  whether  the  refinement  satisfies 
its  design  rules. 


*This is a good example of the modeless style of
interaction supported by PegaSys in that argument
selection precedes command selection. See
Tesler's discussion of this approach to
man-machine interfacing.30


Refinement  of  an  active  entity.  Re¬ 
call  that  an  active  entity  is  an  entity 
that  has  the  ability  to  manipulate  data. 
The  active  entities  in  Figure  1  are  the 
host  processes  and  the  line  module. 
The  next  step  in  the  scenario  illustrates 
a refinement technique called active
entity refinement, the first, and sim¬
plest,  of  three  refinement  techniques 
employed  in  the  network  devel¬ 
opment. 

Provided the replacement preserves interactions involving the
replaced entity, PegaSys allows an active entity to be replaced by a
picture. The three steps in an active entity refinement are
illustrated by Figures 3 through 5. The window at the bottom of the
display (see Figure 3) contains a copy of level 1, where the Line
module has been selected for refinement (indicated by boldface
highlighting). The user next constructs a replacement for the line,
as shown in Figure 4. Solid-lined rectangles denote “subprograms”
that are intended to specify the interface to the line; Out_Pkt is a
variable modified by these subprograms.* Finally, the user indicates
(by pointing) the exact connection between the replacement for the
line and the hosts (see Figure 5). This completes the refinement, and
PegaSys checks that interactions at level 1 are preserved at this
level and that the entire picture at level 2 satisfies the type
constraints imposed by the form calculus. (PegaSys accepts only
well-formed pictures and, therefore, requires that type errors be
removed before it attempts any further analysis of the offending
picture.)

*The access and mod relations in Figure 4 require Out_Pkt to be a
variable belonging to the physical line module.

Figure 6. Level 1 of protocol hierarchy for a host computer. The
upper-left window contains a picture of the entire level, which is
explained by the views in the other three windows. This figure is
read starting with (a) for the upper-left window and progressing
clockwise for windows (b) through (d).

Figure  5  also  illustrates  a  useful 
aspect  of  picture  layout  in  PegaSys. 
Even though the Snd and Rcv entities
appear  four  times  each  at  level  2,  they 
describe  only  one  interface  to  the  line 
and,  therefore,  appear  only  once  in  the 
internal  logical  representation  of  the 


picture.  In  general,  duplication  is  a 
good  technique  for  avoiding  crossover 
and  curved  lines. 
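
The interaction-preservation check applied to the line refinement can be sketched as follows. This is a simplified reading, not the PegaSys algorithm: we assume that "preserving interactions" means every earlier interaction involving the replaced entity reappears once the entities of the replacement are mapped back to it. The function name, the data layout, and the choice of which host arrow attaches to Snd and which to Rcv are our own illustrative assumptions.

```python
from typing import FrozenSet, Set, Tuple

Interaction = Tuple[str, Tuple[str, ...]]


def preserves_interactions(
    old: Set[Interaction],
    new: Set[Interaction],
    replaced: str,
    replacement: FrozenSet[str],
) -> bool:
    """Check that every old interaction touching `replaced` survives in `new`."""

    def abstract(name: str) -> str:
        # Map each entity of the replacement picture back to the entity it refines.
        return replaced if name in replacement else name

    projected = {(rel, tuple(abstract(a) for a in args)) for rel, args in new}
    required = {(rel, args) for rel, args in old if replaced in args}
    return required <= projected


level_1 = {("Write", ("Host1", "Line", "pkt")), ("Read", ("Host1", "Line", "pkt"))}
# Snd and Rcv may be drawn several times, but a set stores each predicate once.
level_2 = {("Write", ("Host1", "Snd", "pkt")), ("Read", ("Host1", "Rcv", "pkt"))}
parts = frozenset({"Snd", "Rcv", "Out_Pkt"})

print(preserves_interactions(level_1, level_2, "Line", parts))  # True
```

Only one host is shown to keep the example short; the same test applies unchanged when all four hosts are included.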

Views are used to manage complexity. We are now ready to design the
hosts in the network. Rather than designing four separate hosts, our
strategy will be to design one host and then “replicate” it four
times at level 2 of the network hierarchy (see Figure 5).

Figure 6a shows the first level in the design hierarchy for a host.
This picture is not particularly perspicuous because it mixes several
important properties of a host. These properties may be separated by
means of three
views (explained below), obviating the need to study Figure 6a.

In general, multiple views of the same picture are used to manage
complexity or to emphasize particular aspects of a picture. A view in
PegaSys is a single grouping of logically related icons from a
picture. Views are presently constructed interactively by structured
selection and positioning of related icons. A more sophisticated view
mechanism, based on relational database technology, is planned.

Two of the views describe the two steps in interhost communication,
namely, establishment of a communication link (i.e., a channel)
between hosts (Figure 6b) and transmission of an actual message
(Figure 6c). In Figure 6b, a sender process asks a data packet
protocol to open a channel between it and another host. (A single
host may have multiple open channels.) If the channel is successfully
opened, the variable OK has the value true and chan contains the name
of the open channel. If the attempt to initiate a connection fails,
OK has the value false. The receiver opens a channel in the same
manner.

Figure 6c describes message transmission. The sender sends a message
(msg) over the open channel (chan) and receives back an indication as
to whether the transmission was successful. The receiver, on the
other hand, tells the data packet protocol that the channel called
chan is open and awaits the arrival of a message.

The third view, shown in Figure 6d, describes the interface between a
host and an external network backbone. This view says that a host
reads and writes packets by means of subprograms Rcv and Snd,
respectively. This interface would be suitable for a variety of
network configurations, including the line interface in Figure 5. We
will say more about this interface later when we paste the completed
host design into our network.

Refinement of an interaction. We have seen one example of how
refinements add detail to an existing design concept. In particular,
a refinement of an active entity adds detail in the sense that it
syntactically elaborates the entity and preserves interactions in the
design. (Pictures only suggest the semantics of entities by means of
mnemonic entity names. It is expected, but in no way enforced, that
the refinement of an entity provide a more detailed description of
the computation suggested by the entity name.) For refinements of
interactions, it is possible to enforce stringent logical
requirements; in particular, a refinement of an interaction must be a
more detailed description of the interaction. For example, if a
picture says that an entity “writes” into a particular data object,
then refinements of the notion of writing must specify one of the
possible ways in which writing may occur.

The sequence of steps performed in refining an interaction is
illustrated in Figures 7 through 10. The user first selects a
dataflow relation D (see Figure 7). Its replacement is constructed by
selecting the menu command for adding a relation and then the Write
relation from a pop-up menu (see Figure 8). Note that the dataflow
relation has disappeared while it is being refined. The Write
relation takes three arguments, two of which are selected in
Figure 8. The two selections


Figure 7. Selection of an interaction relation for refinement. The
letter D is an abbreviation for a dataflow relation.

Figure 8. Selection of the Write relation from a pop-up menu to
replace the dataflow relation selected in Figure 7.

Figure 9. Selection of an argument to the Write relation. The cursor
has changed to prompt for a selection.

Figure 10. The dataflow relation of Figure 7 has been replaced by the
Write relation (abbreviated as W). Validation of this refinement
required a logical proof.

Figure 11. Selection of passive entity host_pkt with a pop-up comment
explaining its selected refinement.

Figure 12. The result of the refinement started in Figure 11.


are  the  bold  ellipse  (whose  name 
Line_Pkt_Protocol  is  occluded  by 
menus)  and  Send_Host_Pkt.  The  third 
argument  is  selected  in  Figure  9.  (The 
cursor  has  changed  to  let  the  user 
know  that  a  selection  is  required.) 
The  final  result  is  seen  in  Figure  10, 
which  specifies  that  a  data  object  of 
type  host_pkt  is  written  from 
Send_Host_Pkt to Line_Pkt_Protocol.

PegaSys  allowed  the  replacement  of 
the  DataFlow  relation  by  the  Write 


relation  because  it  was  able  to  prove  a 
certain  logical  relationship  between 
them.  Roughly  speaking,  the  refine¬ 
ment  of  an  interaction  is  said  to  add 
detail  if  the  interaction  is  a  logical  con¬ 
sequent  of  its  refinement.  This  logical 
relationship  is  verified  by  means  of  a 
(fully automatic) logical proof.*
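
The "logical consequent" test for an interaction refinement can be illustrated with a toy check. We stress that this is only a sketch: the definition of DataFlow used below is hypothetical (the actual definitions belong to the form calculus and are not given in this article), the names `DEFINITIONS` and `entails` are our own, and a real proof procedure would handle arbitrary formulas rather than one-step lookups.

```python
from typing import Callable, Dict, List, Tuple

Interaction = Tuple[str, Tuple[str, ...]]

# Hypothetical definition, for illustration only:
#   DataFlow(src, dst, t) holds if Write(src, dst, t) or Read(dst, src, t).
DEFINITIONS: Dict[str, List[Tuple[str, Callable[..., Tuple[str, ...]]]]] = {
    "DataFlow": [
        ("Write", lambda src, dst, t: (src, dst, t)),
        ("Read", lambda src, dst, t: (dst, src, t)),
    ],
}


def entails(refinement: Interaction, interaction: Interaction) -> bool:
    """Return True if `interaction` follows from `refinement` under DEFINITIONS."""
    if refinement == interaction:
        return True
    relation, args = interaction
    return any(
        refinement == (alt, rearrange(*args))
        for alt, rearrange in DEFINITIONS.get(relation, [])
    )


abstract = ("DataFlow", ("Send_Host_Pkt", "Line_Pkt_Protocol", "host_pkt"))
refined = ("Write", ("Send_Host_Pkt", "Line_Pkt_Protocol", "host_pkt"))
print(entails(refined, abstract))  # True
```

Under the assumed definition, a Write in the opposite direction would be rejected, which mirrors the requirement that a refinement be a more detailed description of the same interaction.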


*This procedure applies to any derived relation,
while the active entity refinement strategy applies
only to primitive active entities.


Refinement of a passive entity.

Recall that a passive entity is a data object manipulated by active
entities. A data object is characterized by a name (which is used to
refer to the object) and a simple or structured type (which denotes a
set of possible values). A passive entity refinement, unlike
refinements described earlier, does not necessarily replace an existing
relation; it usually augments a partial characterization of a data
object. The simplest example is the addition of a missing name or type
to a data object. A more complex example involves the specification of
the structure of a global type. (All types are global.) It is often
convenient to specify different instances of the same type differently.
In particular, only the relevant components of a structured data object
need be specified for each instance of the object.

[Figure 13 screen image: Ada source text for the functions
SEND_HOST_PACKET and RECEIVE_HOST_PACKET; the listing is illegible in
this copy.]

Figure 13. Associating entities in pictures with ...

The refinement of a single instance of a type is illustrated in
Figures 11 and 12. In Figure 11, the user has selected the type
host_pkt (indicated by the bold rectangle), constructed the refined
structure (the structType relation at the present position of the
cursor), and entered the pop-up comment explaining the relation.
Figure 12 shows the completed refinement. This refinement of host_pkt
into a four-tuple applies only to the selected instance of type
host_pkt. Components of the host_pkt structured type, such as host#
and pkt_kind, can be further refined into structures and substructures
using the structType relation.*

*In order to avoid clutter on the display, simple types are not
actually replaced by structured types in pictures. For example,
host_pkt was not actually replaced in the picture by the four-tuple
describing its structure. However, if the user points at host_pkt (on
the arc between Send_Host_Pkt and the partially occluded ellipse) and
presses a button on the mouse, the specified structure is displayed
(see Figure 12).

Sometimes it is convenient to refine an instance of a simple type into
another simple type, rather than a structured type. This is done if a
single type is used to denote a union of types. For example, we use the
type pkt in our network as an abstraction for two different kinds of
packets, a line packet and an acknowledgment packet. PegaSys allows us
to replace pkt by line_pkt in one refinement and by ack_pkt in another.
(Eventually, line packet line_pkt is refined into < host#, host#, seq,
host_pkt >. This packet encapsulates host packets and contains an
acknowledgment bit (of type seq) required by the line-level protocol.
An acknowledgment packet is refined into < host#, host#, seq >.)

Figure 14. A design is shared through its interface entities, which are
connected to another design by pointing.
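
To make the idea of one abstract type standing for a union of types concrete, here is a minimal Python sketch. It only mirrors the tuple shapes quoted above; the field names (`src_host`, `dst_host`), the use of `bytes` for the encapsulated host packet, and the helper `carries_data` are assumptions made for illustration and do not come from PegaSys.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class LinePkt:
    """Stands for < host#, host#, seq, host_pkt >."""
    src_host: int     # assumed meaning of the first host#
    dst_host: int     # assumed meaning of the second host#
    seq: int          # acknowledgment bit used by the line-level protocol
    host_pkt: bytes   # encapsulated host packet (representation assumed)


@dataclass(frozen=True)
class AckPkt:
    """Stands for < host#, host#, seq >."""
    src_host: int
    dst_host: int
    seq: int


# The abstract type pkt is refined into one of two concrete types.
Pkt = Union[LinePkt, AckPkt]


def carries_data(pkt: Pkt) -> bool:
    """Only line packets encapsulate a host packet."""
    return isinstance(pkt, LinePkt)


if __name__ == "__main__":
    print(carries_data(LinePkt(1, 2, 0, b"payload")))
    print(carries_data(AckPkt(2, 1, 0)))
```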

Reasoning  about  programs 

We  omit  the  development  of  the  re¬ 
maining  levels  in  the  host  hierarchy 


(which  are  described  in  reference  1) 
and  rejoin  the  session.  We  are  now 
ready  to  implement  the  host  in  Ada 
and  to  verify  that  its  implementation  is 
logically  consistent  with  the  host  de¬ 
sign.  Programs  are  written  interactive¬ 
ly  using  the  PegaSys  structure-oriented 
editor;  the  verification  process  does 
not  require  human  intervention  except 
to  establish  the  correspondence  be¬ 
tween  entities  in  a  picture  and  program 
constructs. 

Each  correspondence  is  specified  by 
two  structured  selections,  one  from  a 
picture  and  one  from  a  program.  This 
is  illustrated  in  Figure  13,  where  the 
user  has  selected  an  entity  called 
Send_Host_Pkt  (indicated  by  the  bold 
rectangle)  and  an  Ada  program  unit 
called  SEND_HOST_PACKET  (indi¬ 
cated  by  the  underlined  text).  Issuing 
the Associate menu command (see the
cursor)  causes  PegaSys  to  record  the 
specified  association. 

PegaSys  requires  that  each  atomic 
entity — i.e.,  one  that  is  not  refined — 
must  be  associated  with  exactly  one 
program  construct.  Active  entities 
must  be  associated  with  program  units 
(in the case of Ada, a subprogram,
package,  task,  or  generic)  and  passive 
entities  with  data  object  or  type 
declarations.  The  kind  of  an  entity 
determines  what  it  can  be  associated 
with.  This  association  may  occur  at 
any  stage  of  the  development  and  at 
any  level  in  a  design  hierarchy. 
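
The association rules lend themselves to a mechanical check. The following Python sketch is one hedged way to express them: atomic entities need exactly one construct, nonatomic entities need none, and the entity kind restricts the construct kind. The data layout, the construct-kind strings, and the function name `check_associations` are all hypothetical.

```python
from collections import Counter

# Construct kinds an entity may be associated with (labels are assumptions).
ACTIVE_CONSTRUCTS = {"subprogram", "package", "task", "generic"}
PASSIVE_CONSTRUCTS = {"object_declaration", "type_declaration"}


def check_associations(entities, associations):
    """Return a list of rule violations.

    entities:     name -> (kind, atomic), kind is "active" or "passive"
    associations: list of (entity_name, construct_kind)
    """
    errors = []
    counts = Counter(name for name, _ in associations)
    for name, (_, atomic) in entities.items():
        found = counts.get(name, 0)
        if atomic and found != 1:
            errors.append(f"{name}: atomic entity needs exactly one construct, found {found}")
        if not atomic and found != 0:
            errors.append(f"{name}: nonatomic entity may not be associated")
    for name, construct in associations:
        if name not in entities:
            errors.append(f"{name}: unknown entity")
            continue
        kind, _ = entities[name]
        allowed = ACTIVE_CONSTRUCTS if kind == "active" else PASSIVE_CONSTRUCTS
        if construct not in allowed:
            errors.append(f"{name}: {kind} entity cannot be associated with a {construct}")
    return errors


if __name__ == "__main__":
    entities = {
        "Send_Host_Pkt": ("active", True),
        "host_pkt": ("passive", True),
        "pkt": ("passive", False),   # refined into line_pkt and ack_pkt
    }
    good = [("Send_Host_Pkt", "subprogram"), ("host_pkt", "type_declaration")]
    print(check_associations(entities, good))
```

A check of this kind could run at any stage of development, since it depends only on the current set of entities and recorded associations.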

Nonatomic entities are not allowed to be associated with program
constructs. We just saw that the type abstraction called pkt was
replaced by line_pkt and ack_pkt. It does not make sense to require
type pkt to be represented in the program, only that line_pkt and
ack_pkt be represented. However, in general, there are situations in
which it would be desirable to associate nonatomic entities with
program constructs. The association would have to be restricted based
upon properties of the refinement history. PegaSys maintains this
history but does not as yet provide this capability.

Figure 15. Hosts are marked to indicate reuse of an extant development.

Once a program construct has been associated with every atomic entity
in a hierarchy, PegaSys can attempt to prove that the program and the
hierarchy are logically consistent. That is, PegaSys proves that the
lowest level in a hierarchy is logically consistent with the program it
is intended to describe. (This does not mean that an entire hierarchy
is consistent with a program. PegaSys shows that every level in a
hierarchy follows from the immediately preceding level by valid
applications of our refinement rules and that the lowest level is
consistent with the program it is intended to describe.) The PegaSys
proof procedure has the following two important characteristics:
(1) properties of nested program units are inherited by their parents
and (2) specified interactions can be satisfied in a variety of ways by
an implementation.1 Without these considerations, impractical
constraints would be placed on an implementation.

It should be pointed out that PegaSys is actually proving that a
picture is logically consistent with a program under a reasonable
interpretation of the program. PegaSys presently assumes that the
consistency between a picture and the program it is intended to
describe does not depend upon certain properties of its implementation.
For example, it assumes that consistency does not depend upon “dead”
control paths or aliasing of names in the same context. For the sorts
of properties described by pictures in PegaSys, the assumptions appear
to be reasonable and to coincide directly with our intuitive model of
what such proofs should mean. These assumptions, together with the
decidability of the form calculus, enable PegaSys to fully mechanize
consistency proofs.

Reuse  of  a  hierarchy 

Having completed the host (its design, implementation, and
verification), we would like to reuse it four times in Figure 5 with
minimal reduplication of previous work. Below, we refer to the
presently active development as the primary one and the development
that we intend to reuse as the secondary one. At this stage of our
example, the network is primary and the host secondary. We first
consider reuse of the host design, then its implementation.

There is a simple, yet useful, way to connect primary and secondary
designs. An atomic active entity in a primary hierarchy may be replaced
by an entire secondary hierarchy provided that (1) the atomic active
entities that serve as interface to the secondary hierarchy are
“matched up” with active entities of the primary hierarchy and
(2) interactions with the replaced entity of the primary design are
preserved (in the same sense as with active refinements).

This procedure is illustrated in Figure 14. The top window shows the
lowest level of the network hierarchy, and the bottom one shows the
highest level of the host hierarchy. We want to replace each of the
host entities in the network (which are atomic) with the entire host
hierarchy. Recall from Figure 6d that a host interfaces with a network
backbone by means of the atomic Snd and Rcv entities. The Snd and Rcv
entities of the host are associated (by pointing) with Snd and Rcv of
the network, respectively. This pairing is done four times, once for
each reuse of the host. In Figure 14, the user has started the series
of pairings by selecting the leftmost Snd subprogram of the network and
the Snd subprogram of the host interface.* Figure 15 shows the final
result. A double-ringed ellipse has been drawn around the network hosts
to indicate their connection to another design hierarchy.

*Things do not always work out as fortuitously as in this example. In
particular, interfaces between designs do not always have identical
interactions. In this event, it is sometimes possible to introduce a
“dummy” entity that serves as an interface between the two designs.

Reusability of an implementation is achieved by direct sharing of
interface code. For example, the implementation of Snd for the host
must be identical, possibly with renaming, to the implementation of Snd
for the network.* This is not as restrictive as it might sound. Most
often, the interface of the secondary reusable implementation consists
of code skeletons involving only headers for program units needed in
the verification of the secondary design. Note that, under this
condition, reverification of the secondary implementation is
unnecessary.

At  this  point,  the  reader  should  not 
be misled into thinking that PegaSys
always avoids unnecessary work. In
fact, PegaSys presently is not incremental
and duplicates work in many
commonly  occurring  situations.  In  this 
example,  the  secondary  host  design 
and  implementation  are  reused  before 
the  network  has  been  implemented 
and  verified.  PegaSys  would  have  to 
reverify  the  entire  network  (except  for 
the  host)  if  the  network  had  been  veri¬ 
fied  before  reuse  of  the  host.  Our  first 
priority  has  been  to  develop  the  basic 
capabilities  of  PegaSys,  and  we  are 
now  beginning  to  consider  the  prob¬ 
lems  of  incremental  analysis. 

Having completed the host and reused it in the development of the
network, the remaining task is to design and implement the physical
line module. As the rest of the session follows the pattern of
development already described, we omit it here.

*The present implementation requires identical names.

PegaSys is an experimental system that we plan to extend in a number of
ways. One area in which it is presently lacking involves the
representation of persistent data and data dependencies, both of which
arise in database applications. We expect to add several new
capabilities, such as incremental analysis of changes, a sophisticated
view mechanism, and a dynamic animation and testing facility.
Animations would display particular execution states in terms of
hierarchical pictures of the program design. In Figure 5, for example,
we might show a packet flow from a host to the line whenever the
“write” relation is satisfied.

While the precision and descriptive capability of pictures has
legitimately been questioned in the past, PegaSys seems to suggest that
it is possible to profitably combine both graphics and logic for a rich
domain. Our limited experience suggests that PegaSys makes techniques
more palatable to program developers. It is our expectation that such
uses of graphics will lead to the utilization of formal documentation
and analysis techniques by a wide, possibly mathematically
unsophisticated audience.
Abstract

A  logical  technique  is  presented  for  approximating  the  semantic  effects  of  a 
change  to  a  software  system.  The  new  method  uses  proofs  about  a  system’s 
structure to obtain after finitely many steps results that are reasonably close
to  optimal.  The  technique  is  general  enough  to  apply  to  implementations  and 
to  formal  specifications  of  abstract  design  structures,  provided  that  abstrac¬ 
tions  are  defined  in  terms  of  certain  predefined  primitives.  An  experimental 
implementation  has  been  completed  in  Prolog. 

1  Introduction 

Programmers are continually faced with the problem of modifying an existing or
partially developed system. Modifications must be made for several reasons: to
correct erroneous behavior, to increase functionality, to adapt to a new environment,
and to improve a correct implementation. A major difficulty in modifying a system,
especially a large one, is that a seemingly minor change to one part of the system
can have unforeseen and subtle consequences in another part. As a result, the
incremental cost of modifications is often unacceptably high.

In  this  paper,  we  formalize  the  basic  question  of  whether  a  change  to  one  system 
object  can  affect  another  system  object,  and  we  present  a  logical  technique  for 
answering a class of questions that can be reduced to instances of this basic question.
Our  approach  has  the  following  important  properties: 

• Potential changes are analyzed with respect to any level in a system’s design,
whether  the  level  contains  abstract  or  concrete  objects  and  connections.  Early 
feedback  on  the  effects  of  changes  is  provided  whenever  design  decisions  are 
formalized  in  a  certain  formal  language. 

• The semantic effects of a change are inferred from structural properties of
a  system.  Our  approach  does  not  depend  upon  the  presence  of  behavioral 
specifications,  which  can  be  difficult  to  construct.  However,  it  does  depend 
on  the  presence  of  structural  information,  which  can  be  recorded  explicitly  by 
a  programmer  or  extracted  mechanically  from  the  system  implementation. 

•  A  good  approximation  of  the  objects  affected  by  a  change  can  be  found  in  a 
finite  number  of  proof  steps.  In  practice,  this  means  that  proofs  can  be  fully 
mechanized,  which  will  make  it  much  easier  to  integrate  our  formal  approach 
into  everyday  software  development. 

We are aware of no other approach to the analysis of changes that is general,
rigorous,  and  fully  mechanizable. 

In  this  paper,  a  design  is  a  logical  description  of  how  a  system  is  put  together, 
i.e.,  its  interesting  objects  and  the  relevant  structural  relationships  among  them. 
This  view  of  design  focuses  on  the  transition  from  a  (formal)  specification  of  what 
a  system  is  intended  to  do  to  an  implementation  that  describes  how  to  perform 
the  specified  computation.  This  what-to-how  transformation  should  be  recorded 
explicitly  because  an  implementation,  especially  an  efficient  one,  will  invariably  be 
highly  interconnected  and  somewhat  unstructured.  Designs  will  tend  to  be  more 
modular  and  easier  to  understand,  since  performance  is  not  as  much  of  a  concern. 

Each  level  in  a  design  can  contain  user-defined  relations  that  denote  abstractions 
of  concrete  objects  and  connections.  Typically,  an  abstraction  can  be  realized  in 
multiple  ways.  For  instance,  “operation”  is  an  abstract  object  that  might  be  realized 
by  a  concrete  procedure  or  function.  The  abstract  relation  “connected  to”  might  be 
realized by various combinations of concrete procedure calls and data accesses and
modifications.  The  lowest  level  in  a  design  hierarchy  directly  models  the  structure 
of  the  implementation;  this  level  can  be  derived  from  the  implementation  using 
classical  flow  analysis  techniques.  A  design  can  be  shown  to  be  logically  consistent 
with  an  implementation  using  the  technique  in  [12]. 

It  is  undecidable  in  general  to  determine  the  exact  behavioral  effects  of  a  change, 
but  it  is  possible  to  obtain  a  precise,  conservative  approximation  by  formalizing  the 
problem  in  terms  of  structural  concepts.  In  particular,  we  say  that  a  change  to 
an object x affects an object y if the pair (x, y) is in the transitive closure of the
“information  flow”  relation.  Unfortunately,  information  flow  is  not  transitive  in  the 
usual  sense.  That  is,  if  there  is  flow  from  some  object  x  to  an  object  y  and  from  y  to 
an  object  z,  there  is  not  necessarily  flow  from  x  to  z.  As  a  consequence,  the  usual 
notion  of  transitivity  gives  a  crude  approximation  of  the  effects  of  a  change.  To 
obtain  a  more  accurate  approximation  of  the  true  transitive  flows  (i.e.,  those  that 
would  occur  when  the  system  is  executed),  we  decompose  the  concept  of  information 
flow  into  three  special  flows  and  provide  axioms  for  composing  the  three  flows  to 
determine  the  transitive  closure  of  the  information  flow  relation. 
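
The failure of ordinary transitivity is easy to see on a two-statement program. The Python sketch below is an illustration only (the program and the function name `run` are made up): there is a direct flow from y to z and a direct flow from x to y, yet no change to x can ever reach z, because the statement that reads y executes before the statement that writes it.

```python
def run(x, y):
    """Straight-line program:  z := y;  y := x  (executed in that order)."""
    z = y   # direct flow y => z
    y = x   # direct flow x => y
    return y, z


if __name__ == "__main__":
    # Changing y changes z, and changing x changes y ...
    print(run(1, 0), run(1, 5))
    # ... but changing x alone leaves z untouched.
    print(run(1, 0), run(2, 0))
```

A closure computed purely from the pairs (x, y) and (y, z) would report that x affects z, which is exactly the kind of crude approximation that motivates composing flows more carefully.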

Since  the  effects  of  changes  are  analyzed  in  terms  of  information  flows,  the  flows 
that  are  implicit  in  a  structural  specification  must  be  enumerated.  For  example,  if  a 
concept  in  a  specification  is  defined  in  terms  of  the  primitive  mod(P,x),  which  says 
that variable x is modified by execution of procedure P, we know that there must be
some variable v whose value is assigned to x. Hence, there is an implicit flow from
v to x. Axioms are defined for deriving implicit flows from specifications written in
our  language.  To  substantially  reduce  the  size  of  a  specification,  system  interactions 
that  are  not  intended  to  occur  need  not  be  specified  (implicitly  or  explicitly).  These 
interactions  are  dealt  with  by  means  of  the  closed-world  assumption  [13]. 

All  axioms,  specifications,  and  questions  about  changes  are  represented  in  a 
first-order  logic  restricted  to  finite  models.  Consequently,  the  proofs  required  to 
answer  questions  contain  no  infinite  search  paths.  This  is  a  direct  consequence  of 
the  decision  to  take  a  structural  approximation  of  a  semantic  property.  The  same 
decision  has  led  to  convergent  algorithms  for  program  flow  analysis  in  the  field  of 
program optimization [1].

The  remainder  of  this  paper  is  organized  as  follows.  The  next  section  illustrates 
the  problem  that  is  the  focus  of  the  paper  and  defines  it  more  precisely;  the  following 
section  describes  related  work.  Our  solution  is  formalized  in  Sections  4-7,  which 
present,  respectively,  our  system  model,  the  axioms  for  inferring  the  flows  implicit 
in  structural  descriptions,  our  axiomatization  of  the  transitivity  of  information  flow, 
and  the  logical  framework  for  stating  and  answering  questions  about  the  effects  of 
changes.  Section  8  summarizes  our  results. 

2  Statement  of  the  Problem 

To illustrate the problem we address, consider the low-level design presented in
Figure 1a. The design consists of several objects: procedures AddInc, Add, and
Inc; variables sum, i, a, b, and z; and the constant 1. Parameters are transmitted
using a value-result semantics. The purpose of procedure AddInc, which is not
specified in the figure, is to add the initial values of i and sum and return the result
in sum; it also increments the initial value of i and returns the result in i.

    PROCEDURE AddInc(sum,i)
      ASSERT call(Add,(sum,i)) AND call(Inc,(i))
    END;

    PROCEDURE Add(a,b)
      ASSERT affects(a,a) AND affects(b,a)
    END;

    PROCEDURE Inc(z)
      ASSERT call(Add,(z,1))
    END;

                 (a)

    [diagram of implicit information flows among the variables;
     not recoverable from this copy]

                 (b)

          AddInc
          /    \
         v      v
       Add <--- Inc

                 (c)

Figure 1: A low-level design: (a) textual representation, (b) diagram of implicit
information flows, and (c) graph of call relationships.

We  are  interested,  for  instance,  in  whether  or  not  a  change  to  the  value  of 
variable  sum  can  affect  the  value  of  variable  z.  At  first  glance,  there  appears  to 
be  a  relatively  simple  way  to  formalize  this  question  in  terms  of  the  concept  of 
“information  flow.”  Informally,  we  say  that  there  is  information  flow  from  variable 
x to variable y, denoted $x \Rightarrow y$, if some change in the value of x affects the value
of y when the program is executed.1 For example, execution of the assignment
statement “b := a” causes flow from a to b. Since two objects can interact indirectly
through any number of intermediaries, the question is not whether $\mathit{sum} \Rightarrow z$, but
instead it is whether the pair (sum, z) is in the transitive closure of $\Rightarrow$ on the set
of all variables, written $\mathit{sum} \Rightarrow^* z$.2

We cannot form the transitive closure of $\Rightarrow$ until the information flow relationships
implicit in the assert statements are made explicit. The assertion for AddInc
says that it makes two calls, one to Add and one to Inc. (The ordering of the calls
is unspecified for the moment.) The assertion for Add says that the initial values
of a and b affect some future value of a. The assertion for Inc specifies that it calls
Add.

1 Classical information theory, developed by Shannon [15] and others, is concerned with the
amount of information generated by a particular event. We are interested in the simpler qualitative
question of whether any information is generated by an event. In other words, we are interested in
whether a change affects an object at all, not in how much it affects it.

2 Let S be a set and R a relation on S. Relation R is transitive if “aRb and bRc” implies “aRc”
for a, b, and c in S. Elements a, b, and c need not be distinct. The transitive closure of a relation
R on a set S will be denoted R*. We say that aR*b if there exists a sequence $s_1, s_2, \ldots, s_n$ of zero
or more elements in S such that $aRs_1, s_1Rs_2, \ldots, s_{n-1}Rs_n, s_nRb$.

Figure 1b depicts the information flow relationships implicit in this structural
description.  Flows  not  in  the  figure  are  assumed  to  be  invalid,  due  to  the  closed- 
world  assumption  discussed  later.  For  example,  it  is  assumed  that  there  is  no  flow 
in  Add  from  a  to  b,  making  b  a  read-only  variable  of  Add.  Procedure  calls  normally 
cause  bidirectional  flow  between  actual  and  formal  parameters.  However,  the  two 
calls  to  Add  cause  only  unidirectional  flow  from  actual  parameters  i  and  1  to  formal 
parameter  b,  since  the  value  of  b  is  unchanged  by  Add. 

Returning to our original question about the possibility of flow from sum to z,
we can use the diagram in Figure 1b to trace the information flow path

$$\mathit{sum} \Rightarrow a \Rightarrow z \tag{1}$$

from which we can infer that $\mathit{sum} \Rightarrow^* z$. This kind of reasoning can be formalized
in terms of the usual transitivity axiom.
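
The naive reasoning behind path (1) can be reproduced with an ordinary transitive closure. In the Python sketch below, the edge list `FLOWS` is an assumption read off the prose description of Figure 1b (bidirectional flow between an actual and a modified formal, one-way flow into the read-only formal b, and the two flows asserted for Add); the figure itself is not reproduced here, so the list may not match it edge for edge.

```python
# Direct flows assumed from the description of Figure 1b.
FLOWS = {
    ("sum", "a"), ("a", "sum"),   # AddInc calls Add(sum, i): sum <-> a
    ("i", "b"),                   # ... and i -> b (b is read-only in Add)
    ("a", "a"), ("b", "a"),       # Add: affects(a,a) and affects(b,a)
    ("i", "z"), ("z", "i"),       # AddInc calls Inc(i): i <-> z
    ("z", "a"), ("a", "z"),       # Inc calls Add(z, 1): z <-> a
    ("1", "b"),                   # ... and 1 -> b
}


def transitive_closure(edges):
    """Smallest transitive relation containing `edges` (naive fixpoint)."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


if __name__ == "__main__":
    closure = transitive_closure(FLOWS)
    print(("sum", "z") in closure)   # the naive analysis answers yes
```

The loop terminates because the closure can only grow and is bounded by the finite set of variable pairs.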

Unfortunately, this simple analysis is much too conservative. A change to the
value of sum cannot affect z if there is no execution sequence for which $\mathit{sum} \Rightarrow^* z$.
We can establish that there is no such sequence in our example specification by
relating the information flow relationships in the specification to its calling
relationships, which are illustrated in Figure 1c. Consider the call from AddInc to Add.
Procedure Add contains no procedure calls and it does not allow the value of formal
a to affect the value of its other formal b. Therefore, the call from AddInc to Add
can only affect the value of sum by means of the path

$$\mathit{sum} \Rightarrow a \Rightarrow \mathit{sum}$$

Procedure AddInc also calls Inc with actual parameter i. But since i is never affected
by sum (by the closed-world assumption), the call from AddInc to Inc cannot result
in a flow from sum to z, from which we can conclude that the call from Inc to Add
cannot either. This completes an informal argument that $\neg(\mathit{sum} \Rightarrow^* z)$.
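
The informal argument can be mimicked by an analysis that keeps each call tied to its own call site. The Python sketch below is a simplified, summary-based approximation written for this example; it is not the axiomatization developed in the paper. The encoding `SPEC` and all function names are hypothetical, call ordering is ignored, and the query only follows flows from a procedure into the procedures it calls (it does not propagate outward to callers).

```python
# Figure 1a encoded as data (layout is an assumption).
SPEC = {
    "AddInc": {"formals": ["sum", "i"], "affects": [],
               "calls": [("Add", ["sum", "i"]), ("Inc", ["i"])]},
    "Add":    {"formals": ["a", "b"], "affects": [("a", "a"), ("b", "a")],
               "calls": []},
    "Inc":    {"formals": ["z"], "affects": [],
               "calls": [("Add", ["z", "1"])]},
}


def closure(pairs):
    """Naive transitive closure of a finite relation."""
    result = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(result):
            for c, d in list(result):
                if b == c and (a, d) not in result:
                    result.add((a, d))
                    changed = True
    return result


def local_flows(spec, summaries, proc):
    """Flows among proc's own names: asserted ones plus each call's effect,
    translated from the callee's formals to the actuals of that call."""
    flows = set(spec[proc]["affects"])
    for callee, actuals in spec[proc]["calls"]:
        binding = dict(zip(spec[callee]["formals"], actuals))
        for src, dst in summaries[callee]:
            flows.add((binding[src], binding[dst]))
    return closure(flows)


def summaries_for(spec):
    """Per-procedure formal-to-formal effects, computed as a least fixpoint."""
    summaries = {p: set() for p in spec}
    changed = True
    while changed:
        changed = False
        for p in spec:
            formals = set(spec[p]["formals"])
            new = {(u, v) for u, v in local_flows(spec, summaries, p)
                   if u in formals and v in formals}
            if new != summaries[p]:
                summaries[p] = new
                changed = True
    return summaries


def can_affect(spec, proc, var, target_proc, target_var):
    """Can a change to `var` in `proc` reach `target_var` in `target_proc`?"""
    summaries = summaries_for(spec)
    affected = set()
    work = [(proc, var)]
    while work:
        p, v = work.pop()
        if (p, v) in affected:
            continue
        flows = local_flows(spec, summaries, p)
        reached = {v} | {dst for src, dst in flows if src == v}
        affected |= {(p, r) for r in reached}
        for callee, actuals in spec[p]["calls"]:
            for formal, actual in zip(spec[callee]["formals"], actuals):
                if actual in reached:
                    work.append((callee, formal))
    return (target_proc, target_var) in affected


if __name__ == "__main__":
    print(can_affect(SPEC, "AddInc", "sum", "Inc", "z"))
    print(can_affect(SPEC, "AddInc", "i", "Inc", "z"))
```

On this encoding, a change to sum enters Add through formal a and returns only to sum, so it never reaches i and therefore never reaches z, while a change to i does reach z through the call to Inc. Both fixpoints range over finite sets of pairs and only grow, so the computation terminates.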

The  main  purpose  of  this  paper  is  to  formalize  this  kind  of  reasoning.  To  do 
so,  a  new  axiomatization  of  the  transitivity  of  information  flow  is  required  in  which 
a  transitive  flow  is  inferred  from  two  individual  flows  only  when  there  is  a  causal 
relationship  between  the  individual  flows.  That  is,  we  must  establish  that  some 
change in the value of variable x in the flow $x \Rightarrow y$ causes a change in the value
of variable z in the flow $y \Rightarrow z$ before we can infer the transitive flow $x \Rightarrow z$.
Let T denote a logical axiomatization of transitive information flow that takes into
account the causal relationships among individual flows. In addition, let S denote
a structural specification, and let I denote axioms for inferring the flows implicit
in specifications. To reason about changes, we must define T and I so that the
transitive closure of $\Rightarrow$, namely,

$$\{(x, y) \mid T \cup I \cup S \vdash x \Rightarrow y\},$$

contains the true information flows in S. Moreover, the derivation of the closure
should  terminate  for  any  specification  S.  The  transitive  closure  with  respect  to  a 
given  specification  S  serves  as  the  basis  for  answering  a  class  of  questions  about 
changes  to  S. 

3  Related  Work 

In  practice,  the  effects  of  changes  are  usually  determined  informally.  For  example,  a 
programmer  might  try  to  find  the  effects  of  a  change  by  studying  various  relations 
extracted  from  the  program  itself,  such  as  direct  (cross-reference)  relations  and 
transitive  relations  about  calling  relationships  and  data  flows.  Numerous  tools  have 
been  developed  which  derive  such  relations,  but  they  are  used  primarily  for  other 
purposes,  such  as  program  optimization  [3],  the  detection  of  simple  errors  [4],  and 
documentation  [10].  We  are  not  aware  of  any  existing  tool  that  provides  a  systematic 
way  of  combining  the  relations  to  determine  the  effects  of  changes. 

Moriconi  [11]  proposed  a  formal  approach  to  the  analysis  of  changes  that  requires 
the  proof  of  formulas  in  an  undecidable  theory.  From  a  behavioral  specification  of 
a  system  and  a  Hoare-style  logic,  his  method  determines  what  formulas  must  be 
proved  to  isolate  the  exact  semantic  effects  of  incremental  changes.  The  drawback 
of  this  approach  is  that  the  effects  of  most  changes  cannot  be  found  without  sub¬ 
stantial  human  assistance  in  proofs.  The  approach  presented  in  this  paper  sacrifices 
exactness  but  has  the  important  advantage  of  logical  simplicity. 

Recent  work  by  Horwitz,  Prins,  and  Reps  [7]  on  interprocedural  program  slicing 
addresses  the  related  problem  of  finding  all  program  objects  that  might  affect  a 
distinguished  object.  They  improve  on  the  interprocedural  slicing  algorithm  of 
Weiser  [18]  by  defining  a  structural  approximation  of  a  slice  in  terms  of  operations 
on  an  extended  program  dependence  graph.  The  problem  of  computing  slices  is  an 
instance  of  the  more  general  problem  addressed  in  this  paper.  That  is,  our  logical 
framework  can  be  used  to  compute  slices  involving  various  kinds  of  concrete  and 
abstract objects. For example, the variables x that can affect a specific variable v
are given by

$$\{(x, v) \mid T \cup I \cup S \vdash x \Rightarrow v\},$$

a  subset  of  the  closure  discussed  earlier.  If  we  are  interested  in  program  slices,  S 
consists  of  those  instances  of  our  primitive  that  are  true  of  the  program.  A  slice 
computed  using  our  logical  system  appears  to  be  as  precise  as  one  computed  using 
the  improved  algorithm  of  Horwitz,  et  al. 

Somewhat  related  is  work  on  configuration  management  (e.g.,  [5,17])  that  uses 
generic  dependency  relations  between  compilation  units  to  determine  what  units 
must be recompiled following a change. The dependency relations are not powerful
enough for system design and a semantic property is not maintained. The goal is
simply  to  ensure  that  a  system  is  properly  compiled. 


4  System  Model 

Our primitives have been designed for describing a system that consists of a collection
of procedures which communicate by parameters that are passed by value-result
(copy-in/copy-out). A value-result semantics precludes the possibility of aliasing.
We  treat  only  scalar  types  and  assume  that  objects  (procedures,  variables,  and 
constants)  have  unique  names.  Modelling  additional  features,  such  as  structured 
types,  aliasing,  shared  global  variables,3  and  modules,  will  require  some  adaptation; 
however,  we  believe  that  our  basic  approach  is  applicable. 

The  formal  system  model  presented  in  this  section  should  be  distinguished  from 
a  language  for  writing  structural  specifications.  The  system  model  can  be  seen  as  a 
common  internal  representation  for  a  class  of  specification  languages.  To  illustrate 
the difference, consider the call relation in the specification of Figure 1a. It takes as
arguments the called procedure and the set of actual parameters, whereas the corresponding
relation in the system model is more verbose, taking as arguments the
calling procedure, the called procedure, the actual parameters, the formal parameters,
and the actual-formal pairings. In this paper, we do not propose a structural
specification language; instead, we concentrate on the underlying system model.
Henceforth, a structural specification is taken to be a logical expression in the system
model unless stated otherwise. The signature of each primitive in the model is
contained  in  Figure  2. 


4.1  Primitive  Objects 

The  basic  objects  in  a  design  are  variables  and  procedures,  which  are  of  type  var 
and proc, respectively. A special kind of variable, called a version variable, is used
to  represent  a  value  of  an  ordinary  variable.  Every  time  the  value  of  an  ordinary 
variable  is  changed,  a  new  version  variable  is  introduced.  A  version  variable  has 
type  war.  The  standard  mathematical  concepts  of  boolean,  finite  sets,  and  finite 
sequences  are  also  predefined  types. 

The version variables that can be used to specify the values of a variable must
be explicitly associated with that variable. To this end, a finite sequence of distinct
version variables is associated with each ordinary variable. The elements of such a
sequence represent the successive values of the associated variable. A sequence for
a variable x must contain one element for each relevant change to x. A specification
need not distinguish among the different values of a variable. In this event, a
single version variable is used to model all values. Failure to distinguish among the
different values will result in a more conservative approximation of the effects of a
change.

3 In our present model, global variables can be represented as additional parameters to each
procedure. However, an explicit representation of globals is preferable.

Primitive Types
    var
    war
    proc

Context
    context: name $\to$ type $\times$ formals $\times$ locals $\times$ versions

Primitive Predicates (context argument left implicit)
    $\Rightarrow_f$: proc $\times$ war $\times$ war $\to$ bool
    $\Rightarrow_b$: proc $\times$ war $\times$ war $\to$ bool
    $\Rightarrow_l$: proc $\times$ war $\times$ war $\to$ bool
    callByVR: proc $\times$ proc $\times$ (war $\times$ var)$^n$ $\to$ bool, n > 0
    mod: proc $\times$ var $\to$ bool
    acc: proc $\times$ var $\to$ bool

Figure 2: System Model.

We use the sequence operations first and last to return the first and last element,
respectively, of a finite sequence. We also use functions that, when supplied with an
element of a sequence, return the preceding and the next elements in the sequence.
For an element e of a finite sequence s, the functions

    next: war $\to$ war

and

    prev: war $\to$ war

are defined by

    prev(next(e)) = e
    prev(first(s)) = first(s)
    next(last(s)) = last(s)
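
As a minimal sketch of these sequence operations, the Python below models a version-variable sequence as a tuple. The names `next_version` and `prev_version` are hypothetical, and the sequence is passed explicitly, which is an assumption of the sketch (in the text, next and prev take only the element, since each version variable belongs to exactly one sequence).

```python
def first(s):
    """First element of a finite, non-empty sequence."""
    return s[0]


def last(s):
    """Last element of a finite, non-empty sequence."""
    return s[-1]


def next_version(e, s):
    """Element after e in s; the last element is its own successor."""
    i = s.index(e)
    return s[min(i + 1, len(s) - 1)]


def prev_version(e, s):
    """Element before e in s; the first element is its own predecessor."""
    i = s.index(e)
    return s[max(i - 1, 0)]


# Hypothetical version sequence for the variable i of the running example.
versions_of_i = ("i_in", "i_out")
```

With this clamping at both ends, prev(next(e)) = e holds for every element other than the last one of the sequence.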


Example 1 The need for version variables can be illustrated by returning to the
design in Figure 1 and asking whether or not $z \Rightarrow sum$. Intuitively, a change in
the value of z should not affect sum because i must be incremented after it is added
to sum. This means that procedure AddInc should call Add before Inc. Version
variables can be used to specify this ordering.

If order is left unspecified (as in Figure 1a) and AddInc calls Add before Inc,
the call to Add gives

    $sum \Rightarrow a \Rightarrow sum$
    $i \Rightarrow b \Rightarrow a \Rightarrow sum$

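and the call to Inc gives

    $i \Rightarrow z \Rightarrow a \Rightarrow z \Rightarrow i$

from which we conclude that $\neg(z \Rightarrow sum)$.4 Reversing the order of calls gives

    $z \Rightarrow i \Rightarrow b \Rightarrow a \Rightarrow sum$    (2)

which implies that $z \Rightarrow sum$.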

Figure 3 shows how version variables can be used to achieve the desired ordering
of calls. The key is in the assertion for AddInc, where the same input value $i_{in}$ is
transmitted to both Add and Inc. The value $i_{out}$ returned by Inc is different from
$i_{in}$ and therefore $i_{out}$ can never be transmitted to Add. This is illustrated by the
information flow diagram in Figure 3b in which path (2) cannot occur. We can
trace a path from z to $i_{out}$, namely,

    $z_{in} \Rightarrow a_{out} \Rightarrow z_{out} \Rightarrow i_{out}$

but there is no path from $i_{out}$ to $i_{in}$ and, hence, no path to sum. □


4.2  Contexts 

An  object  has  a  name,  a  type,  and  possibly  certain  other  properties.  For  instance,  a 
procedure  object  has  a  name,  the  type  proc,  and  a  set  of  formal  parameters.  Objects 
are modelled by a function called a context, which would be derived from declarations
written in a structural specification language or a programming language.
Contexts  are  used  in  the  evaluation  of  logical  expressions. 

Formally, a context for a specification S is a function

    context: names $\to$ types $\times$ formals $\times$ locals $\times$ versions

where

• names is a finite set containing the object names in S,

• types is a finite set containing the primitive and constructed types in S,

• formals is the finite powerset of the set of variables in S, used for recording
the parameter lists of procedures,

• locals is the finite powerset of the set of variables in S, used for recording the
local variables of procedures, and

• versions is a finite set whose elements are the version-variable sequences in S.

4 The path $i \Rightarrow z \Rightarrow a \Rightarrow sum$ is not a possibility because there is no causal relationship
between $i \Rightarrow z \Rightarrow a$ and $a \Rightarrow sum$. This would be detected by the transitivity axioms of
Section 6.

PROCEDURE AddInc(sum,i)
    ASSERT call(Add,(sum_in,i_in)) AND
           call(Inc,(i_in))
END;

PROCEDURE Add(a,b)
    ASSERT affects(a_in,a_out) AND
           affects(b_in,a_out)
END;

PROCEDURE Inc(z)
    ASSERT call(Add,(z_in,1))
END;

(a)

(b) Information flow diagram (nodes include sum_in, sum_out, i_in, and i_out).

Figure 3: Using version variables to order calls from AddInc to Add and Inc.

A context is fixed for a given specification. In this paper, the context associated
with a specification S is treated as an implicit argument of every predicate in S,
including the primitive predicates. For instance, let C be the context for the specification
under consideration and let the function allprocs, when supplied with a
context, return the set of all procedure names in it. Then, the expression ($\forall$x: proc)
is written as a shorthand for ($\forall$x $\in$ allprocs(C)).

The following operations on contexts are used later in the paper. The predicate
versionOf is a mapping war $\times$ var $\to$ bool that checks whether a version variable
is a version of an ordinary variable. The predicate formalOf (localOf) is a mapping
var $\times$ proc $\to$ bool that checks whether a variable is a formal (local) of a procedure.
It is often convenient to ask whether a given variable or version variable is a variable
of a procedure. The predicate

    varOf: (var + war) $\times$ proc $\to$ bool

is defined by

    varOf(x, P) = localOf(x, P) $\vee$ formalOf(x, P)
    varOf($\hat{x}$, P) = ($\exists$x: var)[versionOf($\hat{x}$, x) $\wedge$ varOf(x, P)]

where  the  hat  denotes  a  version  variable.  The  only  retrieval  operation  we  will  use  is 
the  function  versions,  which  returns  the  version-variable  sequence  associated  with 
a  given  variable. 

Example 2 The declarations in the specification of Figure 3 define a context that
can be represented in tabular form:

    name      type   formals   locals     versions
    AddInc    proc   ∅         {sum, i}   ∅
    sum       var    ∅         ∅          (sum_in, sum_out)
    sum_in    war    ∅         ∅          ∅
    sum_out   war    ∅         ∅          ∅
    i         var    ∅         ∅          (i_in, i_out)
    i_in      war    ∅         ∅          ∅
    i_out     war    ∅         ∅          ∅
    Add       proc   {a, b}    ∅          ∅
    a         var    ∅         ∅          (a_in, a_out)
    a_in      war    ∅         ∅          ∅
    a_out     war    ∅         ∅          ∅
    b         var    ∅         ∅          (b_in)
    b_in      war    ∅         ∅          ∅
    Inc       proc   {z}       {1}        ∅
    z         var    ∅         ∅          (z_in, z_out)
    z_in      war    ∅         ∅          ∅
    z_out     war    ∅         ∅          ∅
    1         var    ∅         ∅          (1_in)
    1_in      war    ∅         ∅          ∅

Notice  that  every  object  has  a  name  and  a  type.  The  other  properties  of  an  object 
depend  on  its  type  and  the  specification  itself.  A  constant,  such  as  the  number  1, 
is  modelled  as  a  variable  having  exactly  one  version  variable.  □ 
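
The context of Example 2 and the operations versionOf, formalOf, localOf, varOf, and allprocs can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Entry` record, the dictionary encoding, and the snake_case function names are hypothetical, and the context is passed explicitly instead of being left implicit as in the text.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One row of the context table: type, formals, locals, versions."""
    type: str
    formals: frozenset
    locals: frozenset
    versions: tuple


NONE = frozenset()


def variable(*versions):
    """An ordinary variable together with its version-variable sequence."""
    return Entry("var", NONE, NONE, versions)


context = {
    "AddInc": Entry("proc", NONE, frozenset({"sum", "i"}), ()),
    "Add": Entry("proc", frozenset({"a", "b"}), NONE, ()),
    "Inc": Entry("proc", frozenset({"z"}), frozenset({"1"}), ()),
    "sum": variable("sum_in", "sum_out"),
    "i": variable("i_in", "i_out"),
    "a": variable("a_in", "a_out"),
    "b": variable("b_in"),
    "z": variable("z_in", "z_out"),
    "1": variable("1_in"),  # a constant: exactly one version variable
}
# Every version variable is itself an object of type war.
for entry in list(context.values()):
    for v in entry.versions:
        context[v] = Entry("war", NONE, NONE, ())


def allprocs(c):
    return {name for name, e in c.items() if e.type == "proc"}


def version_of(v, x, c):
    return c[x].type == "var" and v in c[x].versions


def formal_of(x, p, c):
    return x in c[p].formals


def local_of(x, p, c):
    return x in c[p].locals


def var_of(x, p, c):
    """varOf for an ordinary variable or for one of its version variables."""
    if c[x].type == "var":
        return local_of(x, p, c) or formal_of(x, p, c)
    return any(version_of(x, y, c) and var_of(y, p, c)
               for y, e in c.items() if e.type == "var")
```

For instance, `var_of("a_out", "Add", context)` holds because a_out is a version of the formal a of Add.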

4.3  Directed  Information  Flow 

In our system model, information flows are associated with procedures. Informally,
we say that information flows from a variable x to a variable y under procedure P
provided a change in the value of x can be conveyed to y when P is executed. For
example, the binding of an actual parameter a to a formal parameter x causes a
flow from a to x.

This concept can be formalized as follows. Let store: id $\to$ val model a computer
memory as a mapping of identifiers to their values, and let eval: proc $\times$ store $\to$
store be the valuation function for procedures. The infix operation . is a function
store $\times$ id $\to$ val that looks up the value of an identifier in a store. The equality
predicate $s_1 =_x s_2$ determines whether or not two stores $s_1$ and $s_2$ have the same
values for all identifiers except possibly for x. Information is transmitted from a to
b by procedure P if and only if variety in a affects the value of b when P is executed.
Formally,

$$a \stackrel{P}{\Rightarrow} b \;\stackrel{\mathrm{def}}{=}\; (\exists s_1, s_2)\; s_1 =_a s_2 \wedge eval(P, s_1).b \neq eval(P, s_2).b$$

where $s_1$, $s_2$ in store. This formulation of information transmission is a slight modification
of the formulation originally developed by Cohen [2] for stating problems
in computer security.
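
The definition above can be checked by brute force on a small, finite value domain. The Python below is a hedged sketch, not part of the paper's method: the function `flows`, the toy procedure `add`, and the restriction to a finite set of values are all assumptions made for illustration, with a store modelled as a dictionary and a procedure as a function from stores to stores.

```python
from itertools import product


def flows(proc, a, b, idents, values):
    """True iff two stores that agree everywhere except possibly at `a`
    lead to different values of `b` after running `proc`."""
    for vals in product(values, repeat=len(idents)):
        s1 = dict(zip(idents, vals))
        for alt in values:
            s2 = {**s1, a: alt}  # s1 and s2 differ at most at identifier a
            if proc(dict(s1))[b] != proc(dict(s2))[b]:
                return True
    return False


def add(store):
    """Toy procedure Add(a, b): a receives a + b, b is left unchanged."""
    result = dict(store)
    result["a"] = store["a"] + store["b"]
    return result


b_affects_a = flows(add, "b", "a", ["a", "b"], range(3))
a_affects_b = flows(add, "a", "b", ["a", "b"], range(3))
```

Here variety in b is conveyed to a, while no choice of stores makes a change in a visible in b.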

For our purposes, it is not enough to know that there is flow between two variables.
In addition, we must know the “directionality” of the flow. Specifically, we
define $\Rightarrow$ to be the logical disjunction of three directed-flow relations: $\Rightarrow_f$, $\Rightarrow_b$,
and $\Rightarrow_l$, standing for forward, backward, and lateral information flow, respectively.
Forward and backward flows model the interprocedural variable bindings that result
from a direct or transitive procedure call. Lateral flow is intraprocedural, involving
local variables of the same procedure. Henceforth, $x \Rightarrow y$ is taken to mean that
(x, y) is in any one of the three directed-flow relations. Formally, the relation

    $\Rightarrow$: proc $\times$ war $\times$ war $\to$ bool

is defined by

$$x \stackrel{P}{\Rightarrow} y \;\stackrel{\mathrm{def}}{=}\; x \stackrel{P}{\Rightarrow}_l y \;\vee\; x \stackrel{P}{\Rightarrow}_f y \;\vee\; x \stackrel{P}{\Rightarrow}_b y$$

Notice that for $a \stackrel{P}{\Rightarrow} b$ to be true, neither a nor b can be a constant. For
example, $3 \stackrel{P}{\Rightarrow} b$ cannot be true since 3 contains no variety. Recall, however,
that we decided to model a constant as a variable. More specifically, we model a
constant as a read-only variable, i.e., a variable whose value cannot be changed by
the program. In this model of constants, a question about the effects of an edit
that would replace one constant with another is meaningful and can be answered
without any additional machinery. A read-only variable must satisfy the derived
ReadOnly predicate, which disallows flow to the variable, but allows flow to emanate
from it.

Figure 4: Directed information flows for the AddInc example in Figure 3a.


Example 3 The directionality of the flows seen earlier in Figure 3b is made explicit
in Figure 4. For example, there is a forward flow from $sum_{in}$ to $a_{in}$ because the call
from AddInc to Add causes the initial value of actual parameter sum to be bound to
formal parameter a of Add. There is a backward flow from $a_{out}$ to $sum_{out}$ because
the value of a is assigned to the output version of sum upon return of control from
Add to AddInc. □


4.4  Additional  Primitive  Connections 

The n-ary callByVR relation is used to model procedure calls. Its first argument is
a procedure P that directly calls a procedure Q with an arbitrary number of actual-formal
parameter pairs. Each call has a value-result semantics and call chains can
be circular. Sometimes we are not interested in the arguments of a call, in which
case we use the function

    dcall: proc $\times$ proc $\to$ bool

which is defined by

    dcall(P, Q) = ($\exists$p: plist) callByVR(P, Q, p)

where plist is a set of pairs of type war $\times$ var, namely the possible actual-formal
pairings in a given context.

The mod and acc predicates are familiar in the field of program optimization [1];
they are also useful in building structural specifications. The relation mod(P, x) says
that a variable x of procedure P (i.e., x $\in$ varOf(P)) can be modified by execution of
P, either directly or transitively through a called procedure.5 The relation acc(P, x)
says that a variable x can be accessed by execution of procedure P. The mod and
acc relations are not independent of the other primitive concepts; we will show how
they can be defined in terms of flow relations.

4.5  Derived  Abstractions 

Ordinarily, a system design would not be specified directly in terms of the primitives.
It instead would be expressed in terms of abstractions appropriate to each
level of detail. Most abstract objects are represented naturally as primitive procedures
or variables which are subsequently “implemented” in terms of one or more
similar objects. On the other hand, most abstract dependencies are best represented
as derived concepts defined in terms of more primitive dependencies. Abstract dependencies
are used to partition a system into manageable parts that interact in
well-defined  and  predictable  ways.  Several  useful  derived  dependencies  are  defined 
below.6 

Example  4  Protecting  a  variable.  It  is  often  useful  to  restrict  access  to  a  variable 
or  to  restrict  the  ways  in  which  a  variable  can  be  used.  For  instance,  we  may  want 
to  allow  procedures  to  read  a  certain  variable  but  prohibit  them  from  writing  it. 
This  is  captured  by  the  predicate 

    ReadOnly: var $\to$ bool

which is defined by

    ReadOnly(x) $\stackrel{\mathrm{def}}{=}$ $\neg$($\exists$p: proc) mod(p, x)

for  x  in  var.  If  a  variable  is  required  to  satisfy  this  predicate,  we  can  specify 
accesses  of  the  variable,  but  any  specified  modification  to  it  will  be  inconsistent 
with  the  above  definition.  □ 

5 For optimization purposes, mod and acc usually contain only variables visible at the interface
to a procedure. However, we must also include local variables not visible at the interface.

6 Design dependencies can be stated informally using various program design languages, several
of which are described in a book by Martin and McClure [9]. These languages provide a few useful
primitive concepts, but they do not support definitional extensions and their meaning is imprecise
and possibly ambiguous.

Example 5 Restricting variable interactions. A set of variables can be partitioned
into independent subsets using a predicate which says that a variable x is completely
independent of a variable y if and only if a change in the value of y has no effect on
the value of x. This predicate

    IndependentOf: var $\times$ var $\to$ bool

is defined, for x and y in var, by

$$\mathrm{IndependentOf}(x, y) \;\stackrel{\mathrm{def}}{=}\; (\forall \hat{x}, \hat{y}: \mathrm{war})(\forall R: \mathrm{proc})\,[\mathrm{versionOf}(\hat{x}, x) \wedge \mathrm{versionOf}(\hat{y}, y) \supset \neg(\hat{y} \stackrel{R}{\Rightarrow} \hat{x})]$$

If  a  variable  x  is  independent  of  a  variable  y,  we  know  that  y  cannot  use  x  as  an 
intermediary  to  affect  some  other  variable  or  procedure.  □ 
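
The derived predicates of Examples 4 and 5 can be evaluated directly once the mod and flow facts are available as finite sets. The following Python is a sketch under assumptions: the fact sets `MOD` and `FLOWS` are hypothetical sample data (the two flows are the conclusions reached later in Examples 8 and 9), `FLOWS` is assumed to be already closed under the axioms, and the function names are illustrative.

```python
# Version sequences of the variables of interest (from Example 2).
VERSIONS = {"sum": ("sum_in", "sum_out"), "i": ("i_in", "i_out"), "1": ("1_in",)}

# Assumed facts: mod as (procedure, variable); flows as (procedure, source, target).
MOD = {("AddInc", "sum"), ("AddInc", "i")}
FLOWS = {("AddInc", "i_in", "sum_out"), ("AddInc", "i_in", "i_out")}


def read_only(x, mod=MOD):
    """ReadOnly(x): no procedure modifies x."""
    return not any(var == x for _proc, var in mod)


def independent_of(x, y, flows=FLOWS, versions=VERSIONS):
    """IndependentOf(x, y): no version of y flows to a version of x
    under any procedure."""
    return not any(src in versions[y] and dst in versions[x]
                   for _proc, src, dst in flows)
```

With these sample facts, the constant 1 is read-only, sum is not independent of i, and i is independent of sum.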

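Example 6 Interprocedural channel. Suppose that we want two procedures to
communicate through a specific variable. We say that a variable x is a channel
from procedure P to procedure Q iff information flows from P to Q through x.
This is captured by

    ChannelTo: proc $\times$ proc $\times$ var $\to$ bool

which is defined by

$$\begin{aligned}
\mathrm{ChannelTo}(P, Q, x) \;\stackrel{\mathrm{def}}{=}\; & (\exists \hat{x}, y, z: \mathrm{war})(\exists R: \mathrm{proc})\,[\mathrm{versionOf}(\hat{x}, x) \wedge \mathrm{varOf}(y, P) \wedge \mathrm{varOf}(z, Q) \;\wedge \\
& ((y \stackrel{R}{\Rightarrow}_f \hat{x} \wedge \hat{x} \stackrel{R}{\Rightarrow}_f z) \vee (y \stackrel{R}{\Rightarrow}_b \hat{x} \wedge \hat{x} \stackrel{R}{\Rightarrow}_f z) \vee (y \stackrel{R}{\Rightarrow}_b \hat{x} \wedge \hat{x} \stackrel{R}{\Rightarrow}_b z))]
\end{aligned}$$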

for  P  and  Q  in  proc  and  x  in  var.  Since  x  is  an  interprocedural  channel,  we  need  not 
consider  lateral  flows  whose  purpose  is  to  link  interprocedural  flows.  We  also  rule 
out  the  possibility  of  a  forward-backward  flow,  since  this  would  make  x  a  channel 
from  P  to  itself.  □ 

Example 7 Interprocedural partitioning. Assume that a procedure A is not intended
to be connected to a procedure B, which we express by

    $\neg$ConnectedTo(A, B)

The ConnectedTo relation says that, for any procedures P and Q, there is a transitive
call from P to Q, or a transitive information flow from a variable referenced
by P to one referenced by Q, or both. The predicate

    Calls: proc $\times$ proc $\to$ bool

is defined recursively by

$$\mathrm{Calls}(P, Q) \;\stackrel{\mathrm{def}}{=}\; (\exists p: \mathrm{plist})\,[\mathrm{callByVR}(P, Q, p) \vee (\exists R: \mathrm{proc})[\mathrm{callByVR}(P, R, p) \wedge \mathrm{Calls}(R, Q)]]$$

where, as before, plist is a set of possible actual-formal pairings. The predicate

    ConnectedTo: proc $\times$ proc $\to$ bool

is defined by

$$\mathrm{ConnectedTo}(P, Q) \;\stackrel{\mathrm{def}}{=}\; \mathrm{Calls}(P, Q) \vee (\exists x, y: \mathrm{war})(\exists R: \mathrm{proc})\,[\mathrm{varOf}(x, P) \wedge \mathrm{varOf}(y, Q) \wedge x \stackrel{R}{\Rightarrow} y]$$

Notice  that  information  may  flow  from  P  to  Q  as  the  result  of  a  transitive  call  from 
P  to  Q  (in  which  case  R  is  P),  or  R  can  be  a  parent  of  P  and  Q  that  transmits  a 
return  flow  from  P  to  Q.  □ 
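
Because call chains can be circular, a direct recursive reading of Calls may not terminate; a reachability search with a visited set avoids this. The Python below is a minimal sketch under assumptions: `DCALL`, `OWNER`, and the sample flow fact are hypothetical encodings of the running AddInc example, and the flow set is assumed to be already closed under the axioms.

```python
# Direct calls (dcall facts) of the running example.
DCALL = {("AddInc", "Add"), ("AddInc", "Inc"), ("Inc", "Add")}

# Procedure owning each version variable (an encoding of varOf).
OWNER = {"sum_in": "AddInc", "sum_out": "AddInc", "i_in": "AddInc",
         "i_out": "AddInc", "a_in": "Add", "a_out": "Add", "b_in": "Add",
         "z_in": "Inc", "z_out": "Inc", "1_in": "Inc"}


def reachable(p, dcall):
    """Procedures reachable from p by one or more direct calls."""
    seen, stack = set(), [p]
    while stack:
        current = stack.pop()
        for caller, callee in dcall:
            if caller == current and callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def calls(p, q, dcall=DCALL):
    """Calls(P, Q): there is a direct or transitive call from p to q."""
    return q in reachable(p, dcall)


def connected_to(p, q, flows, dcall=DCALL, owner=OWNER):
    """ConnectedTo(P, Q): a transitive call, or a flow between their variables."""
    return calls(p, q, dcall) or any(
        owner[x] == p and owner[y] == q for _proc, x, y in flows)


# Sample flow fact (procedure, source, target): the return flow a_out to sum_out.
SAMPLE_FLOWS = {("AddInc", "a_out", "sum_out")}
```

Here Add never calls AddInc, yet `connected_to("Add", "AddInc", SAMPLE_FLOWS)` holds because of the return flow from a_out to sum_out.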

5  Inferring  Directed  Flows 

We  must  identify  any  implicit  flows  in  a  specification  before  we  can  apply  our 
transitivity  axioms.  The  axioms  presented  in  this  section  can  be  used  to  deduce  the 
flows  left  implicit  in  any  specification  constructed  using  the  primitives. 

The  first  two  axioms  in  Figure  5  allow  us  to  infer  directed  flows  from  calls.  The 
VP  axiom  handles  the  situation  in  which  the  value  of  an  actual  parameter  is  actually 
used  by  the  called  procedure.  In  this  event,  there  is  a  forward  flow  from  the  actual 
parameter  to  the  corresponding  formal  parameter.  In  the  antecedent  of  the  axiom, 
af  is  an  actual-formal  pair  and  afpairs  (of  type  plist)  is  a  set  of  such  pairs.  The 
member  operation  tests  whether  af  is  in  afpairs.  In  the  consequent,  the  value  of  the 
expression  first(af)  is  a  version  variable  transmitted  as  an  actual  parameter;  the 
value  of  first(versions(last(af)))  is  the  input  version  variable  for  the  corresponding 
formal  parameter. 

The  RP  axiom  says  that  a  backward  flow  from  a  formal  to  its  corresponding 
actual  occurs  only  when  the  value  of  the  formal  is  modified  during  execution.  If 
it  is  not,  there  is  no  need  to  return  its  value  and,  hence,  no  backward  flow  is 
necessary.  The  consequent  of  this  axiom  specifies  a  backward  flow  from  the  output 
version variable associated with the formal parameter in the pair af to the next
version  of  the  variable  transmitted  as  an  actual  parameter. 

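Value Parameter (VP)

$$\mathrm{callByVR}(P, Q, \mathit{afpairs}) \wedge \mathrm{member}(\mathit{af}, \mathit{afpairs}) \wedge \mathrm{acc}(Q, \mathrm{last}(\mathit{af})) \supset \mathrm{first}(\mathit{af}) \stackrel{P}{\Rightarrow}_f \mathrm{first}(\mathrm{versions}(\mathrm{last}(\mathit{af})))$$

Result Parameter (RP)

$$\mathrm{callByVR}(P, Q, \mathit{afpairs}) \wedge \mathrm{member}(\mathit{af}, \mathit{afpairs}) \wedge \mathrm{mod}(Q, \mathrm{last}(\mathit{af})) \supset \mathrm{last}(\mathrm{versions}(\mathrm{last}(\mathit{af}))) \stackrel{P}{\Rightarrow}_b \mathrm{next}(\mathrm{first}(\mathit{af}))$$

Mod and Information Flow (MI)

$$\mathrm{mod}(P, x) \equiv (\exists \hat{x}: \mathrm{war})[\mathrm{versionOf}(\hat{x}, x) \wedge (\exists y: \mathrm{war})[(\mathrm{varOf}(y, P) \wedge y \stackrel{P}{\Rightarrow}_l \hat{x}) \vee (\exists Q: \mathrm{proc})[\mathrm{dcall}(P, Q) \wedge \mathrm{varOf}(y, Q) \wedge y \stackrel{P}{\Rightarrow}_b \hat{x}]]]$$

Acc and Information Flow (AI)

$$\mathrm{acc}(P, x) \equiv (\exists \hat{x}: \mathrm{war})[\mathrm{versionOf}(\hat{x}, x) \wedge (\exists y: \mathrm{war})[(\mathrm{varOf}(y, P) \wedge \hat{x} \stackrel{P}{\Rightarrow}_l y) \vee (\exists Q: \mathrm{proc})[\mathrm{dcall}(P, Q) \wedge \mathrm{varOf}(y, Q) \wedge \hat{x} \stackrel{P}{\Rightarrow}_f y]]]$$

Figure 5: Finding implicit flows. The type of each free variable can be inferred from
the signatures in Figure 2.

Normally, there is a bidirectional flow between actuals and formals. The use of
two separate axioms, however, will improve our estimate of the effects of a change
whenever the flow happens to be unidirectional. If a formal parameter is not modified
by the called procedure, there is no return flow. If a formal parameter is used
only to return values, there is no forward flow. If a formal parameter is not used at
all, no flow occurs and none can be inferred using the axioms.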
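
The VP and RP axioms can be read as a procedure that turns call facts into directed flows. The Python below is a sketch under assumptions: the acc and mod facts are supplied as given sets (in the text they are themselves characterized by axioms MI and AI), and the names `CALLS`, `ACC`, `MOD`, and `infer_call_flows` are hypothetical. The data encodes the calls of Figure 3a.

```python
# Version sequences per ordinary variable (from Example 2).
VERSIONS = {"sum": ("sum_in", "sum_out"), "i": ("i_in", "i_out"),
            "a": ("a_in", "a_out"), "b": ("b_in",),
            "z": ("z_in", "z_out"), "1": ("1_in",)}
VAR_OF = {v: x for x, seq in VERSIONS.items() for v in seq}

# callByVR facts: (caller, callee, actual-formal pairs).
CALLS = [("AddInc", "Add", (("sum_in", "a"), ("i_in", "b"))),
         ("AddInc", "Inc", (("i_in", "z"),)),
         ("Inc", "Add", (("z_in", "a"), ("1_in", "b")))]

# Assumed acc and mod facts for the formals of the called procedures.
ACC = {("Add", "a"), ("Add", "b"), ("Inc", "z")}
MOD = {("Add", "a"), ("Inc", "z")}


def next_version(v):
    """Next version in v's sequence; the last version is its own successor."""
    seq = VERSIONS[VAR_OF[v]]
    return seq[min(seq.index(v) + 1, len(seq) - 1)]


def infer_call_flows(calls, acc, mod):
    """Apply VP and RP to every actual-formal pair of every call.
    Flows are (procedure, direction, source, target)."""
    flows = set()
    for caller, callee, afpairs in calls:
        for actual, formal in afpairs:
            if (callee, formal) in acc:   # VP: actual to input version of formal
                flows.add((caller, "f", actual, VERSIONS[formal][0]))
            if (callee, formal) in mod:   # RP: output version of formal to next actual
                flows.add((caller, "b", VERSIONS[formal][-1], next_version(actual)))
    return flows


IMPLICIT = infer_call_flows(CALLS, ACC, MOD)
```

Since b is accessed but never modified by Add, only a forward flow into b_in is produced for it; no return flow is inferred.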

The axioms defining mod and acc are intuitively simple but syntactically complex.
Axiom MI says that a variable x is modified by P if and only if a version of x
is modified within P or as a result of a return flow from a called procedure. Axiom
AI says that x is accessed by P if and only if a version of x is accessed within P or
is transmitted by P as an actual parameter. Variable modification and access are
determined by directed flows: if $a \stackrel{P}{\Rightarrow} b$, then a is accessed and b is modified.


6  Transitivity  Using  Directed  Flows 

To  accurately  track  the  flow  of  information,  we  introduce  twelve  logical  axioms 
in  Figure  6  that  define  transitivity  for  the  information  flow  relation.  The  axiom 
schema at the top of the figure says directional flows are transitive in the usual
sense. 

The  next  four  axioms  combine  directional  flows  with  lateral  flows.  Lateral  flows 
among variables of a procedure are “directionless” in that they are used to link
interprocedural  flows  and  to  allow  interprocedural  flow  propagation  to  proceed  in 
either  direction.  For  instance,  Axiom  FL  says  that  if  a  forward  flow  is  followed  by  a 
lateral  flow,  the  overall  direction  of  flow  is  forward.  The  direction  will  stay  forward 
unless  it  is  changed  by  a  backward  flow.  Similarly,  axiom  LF  says  that  a  lateral  flow 
followed  by  a  forward  flow  results  in  a  forward  flow.  In  both  instances,  the  lateral 
flow  serves  as  an  intermediate  flow  connecting  to  a  propagated  forward  flow. 

The  BF  and  FB  axioms  combine  forward  and  backward  flows.  Axiom  BF  says 
that  if  there  is  a  backward  flow  from  a  variable  x  in  a  called  procedure  to  a  variable 
y  in  its  caller,  and  the  caller  then  transmits  y  forward  to  variable  z  through  another 
call,  the  resultant  direction  of  flow  from  x  to  z  is  forward. 

Axiom  FB  is  somewhat  complicated  because  it  must  trace  a  flow  emanating 
from  a  call  site  to  the  called  procedure  and  back  to  the  same  call  site.  This  is 
accomplished  by  the  third  conjunct  in  the  antecedent.  This  conjunct  contains  two 
disjuncts,  which  handle  the  two  situations  illustrated  in  Figure  7.  In  both  instances, 
procedure  P  calls  procedure  Q  and  variable  y  belongs  to  Q.  The  first  disjunct 
(Figure  7a)  says  that  when  the  value  of  actual  x  is  transmitted  to  formal  a,  Q  may 
modify  the  value  of  a  and  then  transmit  this  new  value  back  to  x.  If  this  occurs, 
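the next version of x, next(x), is assigned the value. The input value $a_{in}$ can be
transmitted directly or indirectly to $a_{out}$.

Lateral-Lateral (LL), Forward-Forward (FF), Backward-Backward (BB)

$$x \stackrel{P}{\Rightarrow}_\delta y \wedge y \stackrel{P}{\Rightarrow}_\delta z \supset x \stackrel{P}{\Rightarrow}_\delta z, \quad \text{if } \delta \text{ is } l, f, \text{ or } b.$$

Forward-Lateral (FL)

$$x \stackrel{P}{\Rightarrow}_f y \wedge y \stackrel{P}{\Rightarrow}_l z \supset x \stackrel{P}{\Rightarrow}_f z$$

Lateral-Forward (LF)

$$x \stackrel{P}{\Rightarrow}_l y \wedge y \stackrel{P}{\Rightarrow}_f z \supset x \stackrel{P}{\Rightarrow}_f z$$

Backward-Lateral (BL)

$$x \stackrel{P}{\Rightarrow}_b y \wedge y \stackrel{P}{\Rightarrow}_l z \supset x \stackrel{P}{\Rightarrow}_b z$$

Lateral-Backward (LB)

$$x \stackrel{P}{\Rightarrow}_l y \wedge y \stackrel{P}{\Rightarrow}_b z \supset x \stackrel{P}{\Rightarrow}_b z$$

Backward-Forward (BF)

$$x \stackrel{P}{\Rightarrow}_b y \wedge y \stackrel{P}{\Rightarrow}_f z \supset x \stackrel{P}{\Rightarrow}_f z$$

Forward-Backward (FB)

$$\begin{aligned}
& x \stackrel{P}{\Rightarrow}_f y \wedge y \stackrel{P}{\Rightarrow}_b z \wedge (\exists Q: \mathrm{proc})(\exists a, b: \mathrm{var})[\,\mathrm{varOf}(y, Q) \;\wedge \\
& \quad ([\mathrm{next}(x) = z \wedge \mathrm{callByVR}(P, Q, (x, a)) \wedge \mathrm{first}(\mathrm{versionsOf}(a)) \Rightarrow_l y \\
& \qquad \wedge\; (y \Rightarrow_l \mathrm{last}(\mathrm{versionsOf}(a)) \vee y = \mathrm{last}(\mathrm{versionsOf}(a)))] \;\vee \\
& \quad\; [\mathrm{sameCall}(P, Q, \{(x, a), (\mathrm{prev}(z), b)\}) \wedge \mathrm{first}(\mathrm{versionsOf}(a)) \Rightarrow_l y \\
& \qquad \wedge\; (y \Rightarrow_l \mathrm{last}(\mathrm{versionsOf}(b)) \vee y = \mathrm{last}(\mathrm{versionsOf}(b)))])\,] \\
& \supset x \stackrel{P}{\Rightarrow}_l z
\end{aligned}$$

Upward flattening (UF)

$$\mathrm{dcall}(P, Q) \wedge x \stackrel{Q}{\Rightarrow}_\delta y \supset x \stackrel{P}{\Rightarrow}_\delta y, \quad \text{if } \delta \text{ is } f, b, \text{ or } l.$$

Figure 6: Transitivity of $\Rightarrow$. Free variables of type war (version-variable) are
indicated by a small letter, of type proc (abstract procedure) by a capital letter.

(a) Caller: x and next(x) = z. Called procedure: $a_{in}$, y, and $a_{out}$.
(b) Caller: x and z. Called procedure: $a_{in}$, y, and $b_{out}$.

Figure 7: Illustration of the two situations handled by axiom FB.

The second disjunct (Figure 7b) handles the case in which the value of x affects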

the value returned to another actual parameter. The derived predicate sameCall is
true if two actual-formal parameter pairs are associated with the same call site. In
axiom FB,

    sameCall(P, Q, {(x, a), (prev(z), b)})

is true if actual x of P is associated with formal a of Q and the previous version of
z (the one before the call) is associated with formal b. That is,

    sameCall: proc $\times$ proc $\times$ plist $\to$ bool

is defined by

$$\mathrm{sameCall}(P, Q, p_1) \;\stackrel{\mathrm{def}}{=}\; (\exists p_2: \mathrm{plist})\,[\mathrm{callByVR}(P, Q, p_2) \wedge (\forall p: \mathrm{ppair})(\mathrm{member}(p, p_1) \supset \mathrm{member}(p, p_2))]$$

where ppair is of type war $\times$ var, an actual-formal pair.
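
Read operationally, sameCall is a subset test against the pair list of some call from P to Q. A minimal Python sketch, assuming call facts are stored as (caller, callee, pairs) triples (the names `CALLS` and `same_call` are hypothetical):

```python
# callByVR facts of the running example: (caller, callee, actual-formal pairs).
CALLS = [("AddInc", "Add", (("sum_in", "a"), ("i_in", "b"))),
         ("AddInc", "Inc", (("i_in", "z"),)),
         ("Inc", "Add", (("z_in", "a"), ("1_in", "b")))]


def same_call(p, q, pairs, calls=CALLS):
    """True iff some call from p to q contains every pair in `pairs`."""
    return any(caller == p and callee == q and set(pairs) <= set(afpairs)
               for caller, callee, afpairs in calls)
```

For example, the pairs (i_in, b) and (sum_in, a) belong to the same call from AddInc to Add, whereas (z_in, a) belongs only to the call from Inc.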

If  the  premise  of  axiom  FB  is  satisfied,  there  is  a  lateral  flow  from  actual  x  to 
actual  z.  This  has  the  effect  of  masking  the  procedure  call,  and  it  permits  x  to  be 
propagated  in  a  lateral,  forward,  or  backward  flow  initiated  by  P.7 

The UF axiom schema, in conjunction with the FB axiom, specifies when interprocedural
flows can legitimately be combined. The UF schema allows flows to
be combined only if they occur on control paths emanating from a common procedure.
More precisely, the schema says that a flow resulting from the execution
of  procedure  Q  also  results  from  the  execution  of  procedure  P  provided  P  directly 
calls  Q. 

7 Axiom FB can be stated more elegantly; the formulation presented in this section was chosen
to mirror the structure of the other axioms as closely as possible.

Example 8 Returning to Figure 3a, suppose that we want to ask “Does the value
of i affect the value of sum?” To answer this question, we must collect the declared
objects into a context, which was done earlier in Example 2, and translate the
specified connections into predicates in our logic. The translation gives

    P1. callByVR(AddInc, Add, ($sum_{in}$, a), ($i_{in}$, b))
    P2. callByVR(AddInc, Inc, ($i_{in}$, z))
    P3. $a_{in} \Rightarrow_l a_{out}$
    P4. $b_{in} \Rightarrow_l a_{out}$
    P5. callByVR(Inc, Add, ($z_{in}$, a), ($1_{in}$, b))

where the call and affects relations in the figure have been translated into the
callByVR and $\Rightarrow_l$ relations, respectively. We now use the axioms in Figure 5 to
deduce the following implicit flows; together with P3 and P4, they correspond to
the ten arrows in Figure 4.

    P6.  $sum_{in} \stackrel{AddInc}{\Rightarrow}_f a_{in}$
    P7.  $a_{out} \stackrel{AddInc}{\Rightarrow}_b sum_{out}$
    P8.  $i_{in} \stackrel{AddInc}{\Rightarrow}_f b_{in}$
    P9.  $i_{in} \stackrel{AddInc}{\Rightarrow}_f z_{in}$
    P10. $z_{out} \stackrel{AddInc}{\Rightarrow}_b i_{out}$
    P11. $z_{in} \stackrel{Inc}{\Rightarrow}_f a_{in}$
    P12. $a_{out} \stackrel{Inc}{\Rightarrow}_b z_{out}$
    P13. $1_{in} \stackrel{Inc}{\Rightarrow}_f b_{in}$

Given assumptions P1-P13, the answer to our question is provided by the following
formal proof.

    1. dcall(AddInc, Add)                                          premise P1, defn. of dcall
    2. $b_{in} \stackrel{AddInc}{\Rightarrow}_l a_{out}$                UF(1, P4)
    3. $i_{in} \stackrel{AddInc}{\Rightarrow}_f a_{out}$                FL(P8, 2)
    4. sameCall(AddInc, Add, {($i_{in}$, b), ($sum_{in}$, a)})      premise P1, defn. of sameCall
    5. $i_{in} \stackrel{AddInc}{\Rightarrow}_l sum_{out}$              FB(3, P7, 4)

The application of the FB axiom in the last step is for the situation depicted in
Figure 7b. □


Example  9  A  slightly  more  difficult  question  is  “Does  the  input  value  of  i  affect  its 
output  value?”  The  following  proof  illustrates  how  the  lateral  flow  in  the  consequent 
of  the  FB  axiom  provides  a  neutral  platform  for  propagating  directional  flows.  Both 
applications  of  the  FB  axiom  are  for  the  situation  in  Figure  7a. 
1. $dcall(Inc, Add)$    P5, defn. of dcall
2. $a_{in} \Rightarrow_l^{Inc} a_{out}$    UF(1, P3)
3. $z_{in} \Rightarrow_f^{Inc} a_{out}$    FL(P11, 2)
4. $sameCall(Inc, Add, \{(z_{in}, a)\})$    P5, defn. of sameCall
5. $z_{in} \Rightarrow_l^{Inc} z_{out}$    FB(3, P12, 4)
6. $dcall(AddInc, Inc)$    P2, defn. of dcall
7. $z_{in} \Rightarrow_l^{AddInc} z_{out}$    UF(5, 6)
8. $i_{in} \Rightarrow_f^{AddInc} z_{out}$    FL(P9, 7)
9. $sameCall(AddInc, Inc, \{(i_{in}, z)\})$    P2, defn. of sameCall
10. $i_{in} \Rightarrow_l^{AddInc} i_{out}$    FB(8, P10, 9)

□ 

7  Questions,  Answers,  and  the  Logic 

The  special  flow  axioms,  structural  specifications,  and  questions  about  changes  are 
all  represented  in  a  single  logic.  Let  DDB  denote  a  “design  data  base”  consisting 
of finitely many formulas that include a structural specification S, the transitivity
axioms T, and the rules I for inferring implicit flows, all expressed in or translated
into the language $\mathcal{L}(DDB)$. $\mathcal{L}(DDB)$ is a typed (many-sorted) first-order logic
with  equality  having  the  following  properties: 

1. There are a finite number of constant signs. The constants are pairwise
distinct, and each one denotes a different design object, such as AddInc or sum.

2. The predicate signs are $\Rightarrow_f$, $\Rightarrow_b$, $\Rightarrow_l$, mod, acc, and callByVR.

3. The type symbols are var, vvar, proc, bool, seq, and set.

4. The well-formed formulas are definite Horn clauses of the form

$H_1 \land \cdots \land H_n \supset C, \quad n \geq 0$

where the $H_i$ and $C$ are atoms containing no function symbols.8

Definitions  introduce  new,  eliminable  symbols  and  we  regard  them  as  additional 
axioms.  A  query  Q  consists  of 

1. A declaration of the form $x_1 : t_1, \ldots, x_n : t_n$, and

8 The functions used in this paper are total and we know the values for any of their arguments.
There  are  no  Skolem  functions  because  wffs  are  quantifier  free. 


2. An expression of the form

$(q_1 y_1 : T_1) \cdots (q_m y_m : T_m)\, W(x_1, \ldots, x_n, y_1, \ldots, y_m)$

where $(q_i y_i : T_i)$ is $(\forall y_i : T_i)$ or $(\exists y_i : T_i)$, the $t$'s and $T$'s are types in $\mathcal{L}(DDB)$,
and $W(x_1, \ldots, x_n, y_1, \ldots, y_m)$ is a quantifier-free formula in $\mathcal{L}(DDB)$ having
free variables $x_1, \ldots, x_n$ and bound variables $y_1, \ldots, y_m$.

Before  defining  what  it  means  for  an  n-tuple  of  constants  to  be  an  answer  to  a 
query,  we  should  point  out  that  it  is  not  possible  to  deduce  negative  information 
with  the  inference  system  defined  above.  For  instance,  in  our  earlier  analysis  of  the 
specification in Figure 3a, we concluded informally that $\neg(sum \Rightarrow z)$. Intuitively,
this is the correct answer. However, it is not possible to infer $\neg(sum \Rightarrow z)$ from a
DDB containing that specification.

Implicit in our informal reasoning was the assumption that all intended flows
were specified and that those that were not specified could not occur. This
assumption must be removed or it must be taken into account in the inference system.
The former approach requires that all relevant positive and negative facts about
the system be stated explicitly in the specification. In our domain, the number of
negative facts can far exceed the number of positive ones, making it impractical to
include the negative facts in a specification. The alternative approach is to specify
all positive facts explicitly and modify a traditional first-order inference system to
infer negative facts by default.9 This can be formalized in terms of Reiter’s
closed-world assumption (CWA) [13], which says that given a data base DB and an atom
$A$, if $DB \nvdash A$, then we can infer $\neg A$. Formally, the CWA closure of a DB is defined
by

$closure(DB) = DB \cup \{\neg P(c) \mid DB \nvdash P(c)\}$

where  each  P(c)  is  a  ground  atomic  formula.  The  CWA  closure  is  known  to  be 
consistent  for  definite  Horn  clauses  [16],  and  any  positive  or  negative  atomic  query 
can  be  evaluated  with  respect  to  it. 
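
The closure definition can be made concrete for a small ground data base. The following Python sketch is an illustration only, not the authors' implementation: it assumes the data base has already been grounded into facts and rule instances, represents atoms as tuples, and computes the least model by forward chaining. A negated atom is then accepted exactly when its positive counterpart is absent from that model. The names `least_model` and `cwa_entails`, and the undifferentiated `flow` predicate, are hypothetical.

```python
def least_model(facts, rules):
    """Forward-chain ground definite Horn clauses to a fixpoint.

    facts: collection of ground atoms (hashable tuples).
    rules: list of (body, head) pairs; body is a tuple of ground atoms.
    """
    model = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if head not in model and all(atom in model for atom in body):
                model.add(head)
                changed = True
    return model


def cwa_entails(model, atom, positive=True):
    """Evaluate a ground literal against the CWA closure of the data base.

    A positive atom holds if it is in the least model; a negated atom
    holds if it is not (this is the closed-world assumption).
    """
    return (atom in model) if positive else (atom not in model)


# Hypothetical ground data base: two flow facts and one ground instance
# of a transitivity rule (illustrative, not the paper's axioms).
facts = {("flow", "i_in", "b_in"), ("flow", "b_in", "a_out")}
rules = [
    ((("flow", "i_in", "b_in"), ("flow", "b_in", "a_out")),
     ("flow", "i_in", "a_out")),
]
model = least_model(facts, rules)
```

With this toy data base, `cwa_entails(model, ("flow", "a_out", "i_in"), positive=False)` holds because the atom is not derivable, which is how a negative answer is obtained without storing any negative facts.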

An answer to a query can now be defined as follows. An n-tuple of constants
$c_1, \ldots, c_n$ is an answer to a query Q with respect to DDB iff

1. $c_1 \in t_1, \ldots, c_n \in t_n$, and

2. $closure(DDB) \vdash (q_1 y_1 : T_1) \cdots (q_m y_m : T_m)\, W(c_1, \ldots, c_n, y_1, \ldots, y_m)$

It  is  clearly  decidable  whether  or  not  an  n-tuple  of  constants  is  an  answer  to  a 
query,  since  there  are  only  finitely  many  constants  and  predicates  in  the  intended 

9 Negative facts may be included in a structural specification for expository purposes, but they
can  be  removed  from  the  DDB  because  they  have  no  influence  on  CWA  query  evaluation. 
interpretation. However, if a design contains a large number of objects, it may
be impractical to evaluate a query using a brute-force approach. The major cause
of inefficiency is the recursive nature of the information-flow axioms. One way to
eliminate the possibility of infinite deductions is to represent certain information
flow relationships explicitly as ground atoms, rather than intensionally as general
facts [14]. But this can require an excessive amount of storage if ground atoms are
stored in the obvious way, e.g., as a set or array of atoms. An open problem is
the development of a time- and space-efficient decision procedure for computing the
transitive closure of $\Rightarrow$ in accordance with our axioms.
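
To illustrate the storage concern, here is a minimal Python sketch that materializes a transitive closure as explicit ground pairs. It is a plain reachability closure over a single binary relation, so it does not apply the directional axioms of this paper and is not the decision procedure the text calls for. The function name `transitive_closure` and the node names are hypothetical.

```python
def transitive_closure(edges):
    """Return the set of all pairs (a, c) such that c is reachable from a."""
    successors = {}
    for a, b in edges:
        successors.setdefault(a, set()).add(b)

    closure = set()
    for start in successors:
        seen = set()
        stack = list(successors[start])
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # already expanded; guarantees termination on cycles
            seen.add(node)
            stack.extend(successors.get(node, ()))
        closure.update((start, node) for node in seen)
    return closure


chain = [("v0", "v1"), ("v1", "v2"), ("v2", "v3")]
cycle = [("p", "q"), ("q", "p")]
chain_closure = transitive_closure(chain)
cycle_closure = transitive_closure(cycle)
```

Marking visited nodes keeps the search finite even for cyclic flows. The cost is space: for a chain of n nodes the stored closure holds n(n-1)/2 pairs, which is the kind of growth that makes the obvious representation expensive.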

The following examples illustrate how to represent questions about variables and
procedures  in  our  logic. 

Example  10  Variables.  Suppose  we  are  interested  in  those  variables  x  that  are 
affected  by  a  change  to  a  given  variable  v  of  procedure  P.  Formally,  we  can  express 
this  by 


$(\exists \hat{x}, \hat{v} : vvar)(\exists R : proc)\,[versionOf(\hat{x}, x) \land varOf(v, P) \land versionOf(\hat{v}, v) \land \hat{v} \Rightarrow^{R} \hat{x}]$

The  variable  x  is  the  only  free  variable;  v  and  P  are  logical  constants.  The  formula 
says  that  a  variable  x  is  affected  by  a  change  to  a  variable  v  (of  P)  if  a  change  to  a 
version  of  v  can  affect  a  version  of  x. 

Our  earlier  question  about  whether  sum  affects  z  is  an  instance  of  this  question. 
If we substitute sum for v, AddInc for P, and z for x, we obtain

$(\exists \hat{x}, \hat{v} : vvar)(\exists R : proc)\,[versionOf(\hat{x}, z) \land varOf(sum, AddInc) \land versionOf(\hat{v}, sum) \land \hat{v} \Rightarrow^{R} \hat{x}]$

which  is  a  formal  statement  of  the  question.  This  formula  is  not  entailed  by  the 
closure  of  the  DDB  containing  the  specification  in  Figure  3.  Therefore,  by  the 
CWA,  we  can  conclude  that  sum  does  not  affect  z.  □ 
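
The query of Example 10 can be read operationally. The Python sketch below is a hedged illustration, not the authors' Prolog implementation: it assumes the flow atoms have already been derived and stored as triples (R, source version, target version), merges the three directed relations into one, and uses only a small subset of the flows for Figure 3a (P6, P8, P9, and the conclusions of Examples 8 and 9) rather than the full closure. The names `affected_variables`, `VERSION_OF`, `VAR_OF`, and `FLOWS` are assumptions.

```python
# versionOf(version, variable): each version belongs to one variable.
VERSION_OF = {
    "sum_in": "sum", "sum_out": "sum",
    "i_in": "i", "i_out": "i",
    "a_in": "a", "a_out": "a",
    "b_in": "b",
    "z_in": "z", "z_out": "z",
}

# varOf(variable, procedure).
VAR_OF = {"sum": "AddInc", "i": "AddInc", "a": "Add", "b": "Add", "z": "Inc"}

# Ground flow atoms (R, source version, target version); illustrative subset.
FLOWS = {
    ("AddInc", "sum_in", "a_in"),
    ("AddInc", "i_in", "b_in"),
    ("AddInc", "i_in", "z_in"),
    ("AddInc", "i_in", "sum_out"),
    ("AddInc", "i_in", "i_out"),
}


def affected_variables(v, p, flows=FLOWS):
    """Variables x with some flow from a version of v (a variable of p)
    to a version of x. Absence from the result is read as 'not affected'
    under the closed-world assumption."""
    if VAR_OF.get(v) != p:
        return set()
    return {
        VERSION_OF[dst]
        for (_proc, src, dst) in flows
        if VERSION_OF.get(src) == v and dst in VERSION_OF
    }
```

On this subset, `affected_variables("sum", "AddInc")` does not contain `z`, matching the negative answer above, while `affected_variables("i", "AddInc")` includes both `sum` and `i`.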

Example  11  Procedures.  Questions  about  procedures  can  be  reduced  to  questions 
about  variables.  For  instance,  if  we  want  to  know  each  procedure  Q  affected  by  a 
change  to  a  given  variable  v  of  procedure  P,  we  write 

$(\exists \hat{x}, \hat{v} : vvar)(\exists R : proc)\,[varOf(\hat{x}, Q) \land varOf(v, P) \land versionOf(\hat{v}, v) \land \hat{v} \Rightarrow^{R} \hat{x}]$

This  expression  says  that  a  procedure  Q  is  affected  by  a  change  to  v  if  it  has  a 
variable  that  is  affected  by  a  change  to  v. 
Similarly, the question of which variables x can be affected by a change to a given
procedure P is expressed by

$(\exists \hat{x}, \hat{v} : vvar)(\exists R : proc)\,[versionOf(\hat{x}, x) \land varOf(\hat{v}, P) \land \hat{v} \Rightarrow^{R} \hat{x}]$

and the question of which procedures Q are affected by a change to a given procedure
P by

$(\exists \hat{x}, \hat{v} : vvar)(\exists R : proc)\,[varOf(\hat{x}, Q) \land varOf(\hat{v}, P) \land \hat{v} \Rightarrow^{R} \hat{x}]$


□ 

Example 12 Abstract slice. An abstract slice consists of those variables x that can affect
a given variable v of procedure P, which is the converse of the previous questions.
This concept can be expressed as

$(\exists \hat{x}, \hat{v} : vvar)(\exists R : proc)\,[versionOf(\hat{x}, x) \land varOf(v, P) \land versionOf(\hat{v}, v) \land \hat{x} \Rightarrow^{R} \hat{v}]$

The usual notion of a slice is concerned with the individual statements where
variables are affected. We have chosen procedures, not statements, as atomic objects to
facilitate  the  design  and  debugging  of  large-scale  systems.  Statement-level  objects 
are  more  suited  to  program  merging,  for  example,  which  is  an  important  application 
of  slicing  [6].  □ 

The  logical  system  and  the  question-answering  technique  defined  in  this  section 
have  been  implemented  in  a  version  of  Prolog  that  employs  the  negation  as  failure 
inference  rule  to  infer  negative  information.  A  Prolog  program  is  a  set  of  definite 
Horn  clauses  that  are  executed  using  a  refinement  of  the  resolution  principle  called 
SLD  resolution.  A  branch  in  an  SLD  proof  tree  is  called  a  success  branch  if  the 
derivation  of  the  goal  succeeds  and  a  failure  branch  if  it  fails.  A  finitely  failed 
SLD  tree  is  one  which  is  finite  and  contains  no  success  branches.  The  negation  as 
failure rule says that if an atom $A$ has a finitely failed SLD tree for a given DDB,
then infer $\neg A$ from that DDB. Since the SLD finite failure set is a subset of the
complement of the success set, the negation as failure rule is less powerful than the
CWA. Nevertheless, it is used for inferring negative information because it is easily
and  efficiently  implemented.  Details  on  SLD  resolution,  negation  as  failure,  etc.  can 
be  found  in  a  book  by  Lloyd  [8]. 
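
The gap between finite failure and the CWA shows up already in a one-clause propositional program. The following Python sketch is an assumption-laden illustration, not the Prolog system described above: it performs a depth-first SLD-style search over propositional definite clauses and uses a depth bound (a hypothetical device) to stand in for detecting an infinite branch. For the program p ← p, the atom p is not a consequence, so the CWA yields ¬p, yet the search for p never fails finitely.

```python
def solve(goals, program, depth):
    """Depth-bounded SLD-style search over propositional definite clauses.

    program maps each atom to a list of clause bodies (lists of atoms);
    a fact has the empty body. Returns "success", "failed" (every branch
    failed within the bound), or "unknown" (some branch hit the bound).
    """
    if not goals:
        return "success"
    if depth == 0:
        return "unknown"
    first, rest = goals[0], goals[1:]
    outcome = "failed"
    for body in program.get(first, []):
        result = solve(list(body) + rest, program, depth - 1)
        if result == "success":
            return "success"
        if result == "unknown":
            outcome = "unknown"
    return outcome


def naf_infers_negation(atom, program, depth=50):
    """Negation as failure: infer the negation only on finite failure."""
    return solve([atom], program, depth) == "failed"


# Hypothetical program: r is a fact, p depends on itself, q has no clauses.
program = {"r": [[]], "p": [["p"]]}
```

Here `naf_infers_negation("q", program)` is true, while `naf_infers_negation("p", program)` is false even though p is absent from the least model. The depth bound is only an approximation: a very deep but finite tree would also be reported as "unknown".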
8  Conclusion


We have presented a general logical technique for isolating the semantic effects of
changes to a software system. The technique applies to structural designs
containing predicates built up from our primitives and to implementations having a
classical data flow semantics. The technique improves upon a straightforward
information flow analysis by decomposing the usual information flow relation into
three finer-grain relations, called directed flow relations, and by defining
transitivity of information flow axiomatically in terms of the three relations. The definition
of transitivity involves several mutually recursive axioms.

It is undecidable in general to determine the semantic effects of a change.
Consequently, we relaxed the requirement that the results of our analysis be exact
and insisted only that the results be reasonably close to exact and conservative.
By relaxing the exactness constraint, we were able to use structural proofs to
approximate the true semantic effects of a change. This led to a decision procedure
for approximating the effects of changes, which we believe is an important step in
making a formal analysis of changes practical.

However,  the  direct  implementation  of  our  technique  in  Prolog  proved  inefficient 
for  systems  containing  a  large  number  of  objects.  Further  research  is  needed  to 
develop  a  fast  algorithm  for  computing  the  transitive  closure  of  the  information 
flow  relation  from  our  transitivity  axioms. 